Case Study 2 — "Stores Near Me," Done Wrong Then Right with PostGIS
Geographic queries look simple ("find stores within 5 km") and are deceptively hard with plain numeric columns. A team's hand-rolled distance math was slow and wrong; PostGIS — a PostgreSQL extension — made it fast and correct.
Background
A retailer's app had a "stores near me" feature. Stores had latitude and longitude columns (plain numeric). To find stores within 5 km of a user, a developer wrote distance math in SQL using the coordinates:
-- ❌ naive "distance" with raw lat/long arithmetic
SELECT name FROM stores
WHERE sqrt(power(latitude - :user_lat, 2) + power(longitude - :user_lon, 2)) < 0.045;
-- "0.045 degrees ≈ 5 km"... sort of?
It seemed to work near the office. In production it was both slow and wrong.
What went wrong
Wrong: Latitude/longitude are angles on a sphere, not flat-plane coordinates. Euclidean distance on raw degrees is geographically incorrect — and the error grows with latitude. One degree of longitude is ~111 km at the equator but shrinks toward the poles (to zero at the poles), while one degree of latitude is ~111 km everywhere. So "0.045 degrees" is a different real distance depending on where you are and which direction — the radius was an ellipse that distorted badly away from the equator. Users in northern cities got wrong results (stores that were actually far were "near," and vice versa). Real geographic distance requires spherical (great-circle / haversine) math, not flat-plane Pythagoras.
Slow: the expression computed sqrt/power on every store row (a full scan) — a B-tree index on latitude or longitude couldn't help a 2-D radius query. At thousands of stores it was tolerable; at scale it wouldn't be.
The fix: PostGIS
The team enabled PostGIS, the PostgreSQL spatial extension, which provides proper geographic types, correct spherical distance functions, and spatial indexes:
CREATE EXTENSION postgis;
-- store a proper geography point (not two loose numerics)
ALTER TABLE stores ADD COLUMN location geography(Point, 4326);
UPDATE stores SET location = ST_MakePoint(longitude, latitude)::geography;
CREATE INDEX ON stores USING gist (location); -- spatial (GiST) index
-- correct AND fast: stores within 5 km, true geographic distance
SELECT name, ST_Distance(location, ST_MakePoint(:user_lon, :user_lat)::geography) AS meters
FROM stores
WHERE ST_DWithin(location, ST_MakePoint(:user_lon, :user_lat)::geography, 5000) -- 5000 m
ORDER BY meters
LIMIT 20;
ST_DWithin(..., 5000) returns stores within a true 5 km (correct spherical distance, in meters, anywhere on Earth), and the GiST spatial index makes it fast by pruning to the relevant geographic region instead of scanning every row. ST_Distance gives the actual distance for sorting. Results were now correct worldwide and fast at any scale.
PostGIS also unlocked features the naive approach couldn't dream of: "which delivery polygon contains this address?" (ST_Contains), route distances, nearest-store ordering, and more — all standard spatial operations.
The analysis
-
Geographic data needs spatial types, functions, and indexes. Lat/long are angles on a sphere; flat-plane arithmetic is wrong (distorting with latitude/direction) and unindexable for radius queries. PostGIS provides the correct geography type, spherical distance, and GiST spatial indexes.
-
ST_DWithin+ a GiST index is correct and fast. Correct because it uses true geographic distance; fast because the spatial index prunes to a region instead of scanning all rows (the spatial analog of Chapter 23's indexing lesson). -
PostGIS is a PostgreSQL extension (theme #4). Spatial work didn't require a separate GIS database — PostGIS made PostgreSQL the spatial database, keeping store/geographic data alongside the rest with SQL and joins. PostGIS is so capable it's the de facto spatial-SQL standard and often the reason teams pick PostgreSQL.
-
"It worked near the office" is the small-data/local trap. The naive math seemed fine at low latitudes and small data — and was wrong elsewhere and slow at scale. Specialized data hides errors until you test at real scope (geographic spread, data volume).
-
Use the right type, don't reinvent it. Hand-rolling distance math (or storing points as two numerics) reinvents — badly — what PostGIS does correctly and quickly. For specialized data, reach for the type/extension built for it.
Discussion questions
- Why is flat-plane (Euclidean) distance on raw lat/long geographically wrong, and why does the error grow with latitude?
- Why couldn't a B-tree index on
latitude/longitudespeed the radius query, and what index does? - How does
ST_DWithin+ a GiST index make the query both correct and fast? - Why didn't the team need a separate GIS database (theme #4)?
- ⭐ Name two spatial questions (beyond radius) that PostGIS makes easy and the naive approach can't handle.