Appendix B — PostGIS Reference
This appendix is reference material for the PostGIS concepts and functions that appear throughout the tutorial. The lessons introduce these as needed and link here for fuller treatment.
Coordinate reference systems and SRIDs
A coordinate reference system (CRS) is a framework for describing positions on Earth. Different CRSs make different trade-offs between accuracy, simplicity, and the kind of operations they support well.
In PostGIS, every geometry has an SRID — a Spatial Reference
ID that identifies its CRS. SRIDs are integers, drawn from the
spatial_ref_sys system table.
SRID 4326 — WGS 84
The most common CRS for global geographic data. Coordinates are longitude and latitude in degrees, on the WGS 84 ellipsoid (the same ellipsoid used by GPS).
This tutorial uses SRID 4326 throughout. Customer locations, depot locations, service area boundaries — all stored as SRID 4326 geometries.
A point in SRID 4326 looks like POINT(151.2093 -33.8688) (Sydney
Opera House). Note the order: longitude first, latitude
second. This trips a lot of developers up because most non-PostGIS
contexts (Google Maps, postal addresses, etc.) say “lat/lng.” In
PostGIS, longitude is the X coordinate; latitude is the Y.
Other SRIDs you may encounter
- 3857 — Web Mercator. Used by web maps (Google, OpenStreetMap, MapLibre tile rendering). Coordinates are metres, but distances are not Earth-accurate at non-equatorial latitudes.
- 3577 — GDA94 / Australian Albers. Equal-area projection optimised for Australia. Used in some Australian government datasets.
- 7844 — GDA2020. The current official Australian CRS.
When you import geographic data, it usually comes with a declared
SRID. PostGIS will reproject between CRSs using ST_Transform.
Picking an SRID for storage
For most general-purpose web and mobile applications: store as 4326. It’s the lingua franca of geographic data exchange. Web maps consume it natively. Most data sources publish it. PostGIS handles the spheroidal calculations correctly via the geography cast (next section).
The exception is when your application is regional and you have strict accuracy requirements (millimetre-precision surveying, for example). Then a regional projected SRID (Albers, UTM, state plane, etc.) might be better.
Geometry vs geography
PostGIS has two spatial types: geometry and geography. Both
store points, lines, polygons. They differ in how they treat
distance and topology.
Geometry
Treats coordinates as Cartesian — a flat plane. Distances are computed with the Pythagorean theorem. Areas are computed assuming a flat surface.
For SRID 4326 (degrees), this means ST_Distance(a, b) returns the
distance in degrees. Not metres. Not kilometres. Degrees of
longitude/latitude, computed as if Earth were flat.
This is fast — the math is trivial — but wrong for “how far apart are these in metres?”
When geometry is the right choice:
- Filtering by approximate proximity (using
ST_DWithinwith a degree-based bound) — fast, indexed - Topology operations (
ST_Contains,ST_Within,ST_Intersects) — correct regardless of units - When the SRID is already in metres (3857, projected CRSs) — distance is then in metres
Geography
Treats coordinates as spheroidal — points on a curved Earth. Distances are computed using great-circle / ellipsoidal formulas.
For any geography column or ::geography cast, ST_Distance(a, b)
returns distance in metres. Always metres, regardless of
SRID.
This is slower than geometry math but accurate.
When geography is the right choice:
- Computing actual distances in metres (proximity reports, travel distances, “how far is X from Y”)
- Cross-region calculations where Earth’s curvature matters
- Anywhere the answer needs to be in metres
The cast pattern
The standard tutorial pattern: store as geometry, cast to geography for distance.
|
|
This gets you geometry’s storage efficiency and indexing, plus geography’s accurate metres for distance calculations. Most production systems do this.
The catch: the geography cast disables index usage for
proximity filters. ST_DWithin(a.location::geography, b.location::geography, 5000) does not use the geometry’s GIST
index. If the predicate needs to be indexed, use geometry with a
degree-based distance:
|
|
The hybrid pattern combines both:
|
|
Filter wide on geometry (uses index, fast); order precise on geography (accurate metres, runs on the small filtered set).
Spatial indexes
PostGIS uses GIST (Generalized Search Tree) indexes for spatial data. The index stores each geometry’s bounding box — the smallest rectangle that contains the geometry.
A GIST index can rapidly answer: “give me all geometries whose bounding boxes intersect this region.” Two related operators exploit this:
The && operator
a && b returns true if a’s bounding box intersects b’s
bounding box. It’s index-friendly: GIST can answer it directly.
&& is only a bounding-box check. Two geometries can have
intersecting bounding boxes without actually intersecting (think
two narrow polygons crossing each other diagonally). For correct
results you usually combine && with a precise function like
ST_Intersects.
In practice you rarely write && directly. PostGIS functions like
ST_DWithin, ST_Contains, and ST_Intersects use &&
internally as a pre-filter, then refine with the precise check.
The bounding-box trick
How ST_DWithin(a, b, X) uses the index:
- Expand
a’s bounding box byXin each direction - Use the GIST index to find candidate geometries whose bounding boxes intersect this expanded box
- For those candidates only, compute the actual distance and
check it against
X
The expensive step is step 3. The index does step 1-2 quickly, typically reducing millions of rows to a handful of candidates.
You can see this in EXPLAIN ANALYZE output:
Index Scan using index_jobs_on_location
Index Cond: (location && st_expand(query_point, 0.05))
Filter: st_dwithin(location, query_point, 0.05)The Index Cond is the index doing its work. The Filter is the
precise check on candidates.
Creating a spatial index
In a Rails migration:
|
|
Or in raw SQL:
|
|
The tutorial schema has GIST indexes on every geometry column.
Distance functions
ST_Distance
ST_Distance(geom_a, geom_b) returns the distance between two
geometries. Units depend on the type:
- Geometry: distance in the SRID’s native units (degrees for 4326, metres for projected CRSs)
- Geography: distance in metres, always
Not index-friendly when used in a WHERE clause:
WHERE ST_Distance(...) < X requires computing the distance for
every row before comparison.
For ordering (in ORDER BY clauses), this isn’t a problem — order operations don’t use indexes the same way.
ST_DWithin
ST_DWithin(geom_a, geom_b, distance) returns boolean: are the two
geometries within the given distance of each other?
Index-friendly with geometry. Not index-friendly with geography (see Geometry vs geography above).
The standard proximity filter for indexed queries.
ST_Length
ST_Length(line) returns the length of a line (or
multi-line). Same unit rules as ST_Distance — degrees for
geometry/4326, metres for geography.
ST_Area
ST_Area(polygon) returns the area of a polygon. Geometry returns
square units of the SRID; geography returns square metres.
Topology functions
These functions ask geometric relationship questions — does X contain Y, do they overlap, do they touch? They return boolean and don’t depend on units.
ST_Contains
ST_Contains(a, b) is true if a completely contains b. The
boundary doesn’t count: a point exactly on a polygon’s edge is not
contained by the polygon under ST_Contains.
Most common use: “find features inside a region.”
|
|
Index-friendly via the bounding-box pre-filter.
ST_Within
ST_Within(a, b) — is a completely within b? The inverse of
ST_Contains. ST_Within(a, b) is the same as ST_Contains(b, a).
ST_Intersects
ST_Intersects(a, b) is true if the two geometries share any point.
Includes boundaries: two polygons that share an edge “intersect.”
More inclusive than ST_Contains — ST_Contains requires complete
containment; ST_Intersects accepts any shared point.
ST_Touches
ST_Touches(a, b) is true if the geometries share boundary points
but not interior points. Two polygons that share an edge “touch”
but don’t overlap.
Useful for finding adjacent regions.
ST_Overlaps
ST_Overlaps(a, b) — partial overlap. The geometries share interior
points but neither contains the other. Two intersecting polygons
where each has area outside the other.
ST_Disjoint
ST_Disjoint(a, b) — the geometries share no points at all. The
opposite of ST_Intersects.
Format functions
Functions that convert geometries to and from external formats.
ST_AsGeoJSON
ST_AsGeoJSON(geom) returns the geometry as a GeoJSON string. Use
::jsonb cast to get a JSONB value directly:
|
|
The standard way to ship geometries to web clients (MapLibre, Mapbox GL JS, Leaflet — all consume GeoJSON natively).
For full FeatureCollection construction, combine with
jsonb_build_object:
|
|
Returns a single JSON document. Faster than building it in Ruby for large result sets.
ST_AsText
ST_AsText(geom) returns the geometry in WKT (Well-Known Text)
format:
POINT(151.2093 -33.8688)
LINESTRING(0 0, 1 1, 2 2)
POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))Mostly useful for debugging — looking at geometries in a human-readable form. Not great for shipping data.
ST_GeomFromText
ST_GeomFromText(wkt, srid) parses WKT back into a geometry. The
inverse of ST_AsText.
|
|
Used for inserting test data, importing from text-based formats, manually constructing geometries.
ST_GeomFromGeoJSON
ST_GeomFromGeoJSON(geojson_text) parses GeoJSON into a geometry.
The inverse of ST_AsGeoJSON.
Used for ingesting GeoJSON data — file imports, API payloads, etc.
Validity and repair (introduced in Module 7)
Geometric data, especially boundaries imported from external sources, can be invalid in subtle ways: self-intersections, ring ordering issues, duplicate vertices, degenerate components.
Many spatial operations behave unpredictably on invalid geometries
— ST_Contains might return wrong results, ST_Area might be
negative, ST_Buffer might fail.
ST_IsValid
ST_IsValid(geom) returns boolean: is the geometry valid?
|
|
Useful as a data quality check after imports.
ST_IsValidReason
ST_IsValidReason(geom) returns a string describing what’s invalid,
or 'Valid Geometry' if the geometry is fine. Useful for diagnosing.
ST_MakeValid
ST_MakeValid(geom) attempts to repair an invalid geometry, returning
a valid version. The result might be a different geometry type than
the input (e.g., a self-intersecting polygon might become a
multi-polygon).
|
|
Standard fix-up step after imports.
Quick reference: function families
A mental map of the function namespace:
ST_Distance,ST_DWithin,ST_Length,ST_Area— measurementST_Contains,ST_Within,ST_Intersects,ST_Touches,ST_Overlaps,ST_Disjoint— topologyST_AsText,ST_AsGeoJSON,ST_GeomFromText,ST_GeomFromGeoJSON— format conversionST_IsValid,ST_MakeValid,ST_IsValidReason— validityST_Buffer,ST_Union,ST_Difference,ST_Intersection,ST_ConvexHull— geometry construction (covered in Module 7)ST_Transform,ST_SetSRID— coordinate system handlingST_X,ST_Y,ST_NumPoints,ST_Boundary— accessing geometry properties
The full PostGIS reference is at https://postgis.net/docs/; this appendix covers the functions actively used in the tutorial.
When to use what
A mental cheat sheet for spatial work:
- Need true distances in metres? → Cast to geography for the measurement
- Need to filter by proximity at scale? → Use geometry with
ST_DWithinand a degree-based bound - Need to find features inside regions? →
ST_Contains(orST_Within) - Need to find adjacent regions? →
ST_Touches - Need to ship geometries to a web client? →
ST_AsGeoJSON - Suspicious of imported data? →
ST_IsValid+ST_MakeValid - Slow spatial query? → Check that GIST indexes exist, that the cast pattern isn’t blocking them, and that the FK alternative isn’t appropriate (see EXPLAIN ANALYZE reference)