PostGIS: Geometry vs Geography
While debbugging an issue with a search query to find hotels in an area around a specific location, I found a problem with the distance calculation that was used.
Long story short: RTFM
The query used a location and calculated the distance from that location to each of the hotels. Both columns are of type geometry
:
SELECT ST_Distance(location_coordinate, hotel_coordinate)
This however returns the value as degrees
. To get the value in meters
we can use another Spatial Reference Identifier
also known as SRID
.
The query I was debugging used SRID 3857
. A projected 2D coordinate system.
SELECT ST_Distance(
ST_Transform(location_coordinate, 3857),
ST_Transform(hotel_coordinate, 3857)
)
Now we have the distance in meters, but the value seems off? I initially thought this was an optimization to speed up the query at the cost of accuracy. But the result was off by roughly 50% (real = 195km, calculated = 325km).
At this point I was confused enough to consult the PostGIS documentation. And sure enough the documentation acknowledges the inaccuracy:
Geometry example - units in meters (SRID: 3857, proportional to pixels on popular web maps). Although the value is off, nearby ones can be compared correctly, which makes it a good choice for algorithms like KNN or KMeans.
You now have 2 options:
- keep using SRID 3857 and apply a correction factor
value * cosd(latitude)
- cast the values to the
geography
type to account for the spherical nature of our earth (geodesic distance)
For the hotel search I chose the 2nd option because it was more convenient at the time:
SELECT ST_Distance(
location_coordinate::geography,
hotel_coordinate::geography
)
Comparison
naive_in_degrees | naive | corrected | precise |
---|---|---|---|
2.03075 degrees | 325 km | 198 km | 194 km |
SELECT
ST_Distance(rostock::geometry, berlin::geometry) AS naive_in_degrees,
ST_Distance(ST_Transform(rostock::geometry, 3857), ST_Transform(berlin::geometry, 3857)) AS naive,
ST_Distance(ST_Transform(rostock::geometry, 3857), ST_Transform(berlin::geometry, 3857)) * cosd (52.520008) AS corrected,
ST_Distance(rostock::geography, berlin::geography) AS precise
FROM (
SELECT
'SRID=4326;POINT(13.404954 52.520008)' AS berlin,
'SRID=4326;POINT(12.108811 54.083336)' AS rostock) AS locations;
Performance
You can expect the geometry
type to be faster than geography
. Take a look at this PostGIS FAQ section for more information.