Skip to content

Appendix B — PostGIS Reference

This appendix is reference material for the PostGIS concepts and functions that appear throughout the tutorial. The lessons introduce these as needed and link here for fuller treatment.

Coordinate reference systems and SRIDs

A coordinate reference system (CRS) is a framework for describing positions on Earth. Different CRSs make different trade-offs between accuracy, simplicity, and the kind of operations they support well.

In PostGIS, every geometry has an SRID — a Spatial Reference ID that identifies its CRS. SRIDs are integers, drawn from the spatial_ref_sys system table.

SRID 4326 — WGS 84

The most common CRS for global geographic data. Coordinates are longitude and latitude in degrees, on the WGS 84 ellipsoid (the same ellipsoid used by GPS).

This tutorial uses SRID 4326 throughout. Customer locations, depot locations, service area boundaries — all stored as SRID 4326 geometries.

A point in SRID 4326 looks like POINT(151.2093 -33.8688) (Sydney Opera House). Note the order: longitude first, latitude second. This trips a lot of developers up because most non-PostGIS contexts (Google Maps, postal addresses, etc.) say “lat/lng.” In PostGIS, longitude is the X coordinate; latitude is the Y.

Other SRIDs you may encounter

  • 3857 — Web Mercator. Used by web maps (Google, OpenStreetMap, MapLibre tile rendering). Coordinates are metres, but distances are not Earth-accurate at non-equatorial latitudes.
  • 3577 — GDA94 / Australian Albers. Equal-area projection optimised for Australia. Used in some Australian government datasets.
  • 7844 — GDA2020. The current official Australian CRS.

When you import geographic data, it usually comes with a declared SRID. PostGIS will reproject between CRSs using ST_Transform.

Picking an SRID for storage

For most general-purpose web and mobile applications: store as 4326. It’s the lingua franca of geographic data exchange. Web maps consume it natively. Most data sources publish it. PostGIS handles the spheroidal calculations correctly via the geography cast (next section).

The exception is when your application is regional and you have strict accuracy requirements (millimetre-precision surveying, for example). Then a regional projected SRID (Albers, UTM, state plane, etc.) might be better.

Geometry vs geography

PostGIS has two spatial types: geometry and geography. Both store points, lines, polygons. They differ in how they treat distance and topology.

Geometry

Treats coordinates as Cartesian — a flat plane. Distances are computed with the Pythagorean theorem. Areas are computed assuming a flat surface.

For SRID 4326 (degrees), this means ST_Distance(a, b) returns the distance in degrees. Not metres. Not kilometres. Degrees of longitude/latitude, computed as if Earth were flat.

This is fast — the math is trivial — but wrong for “how far apart are these in metres?”

When geometry is the right choice:

  • Filtering by approximate proximity (using ST_DWithin with a degree-based bound) — fast, indexed
  • Topology operations (ST_Contains, ST_Within, ST_Intersects) — correct regardless of units
  • When the SRID is already in metres (3857, projected CRSs) — distance is then in metres

Geography

Treats coordinates as spheroidal — points on a curved Earth. Distances are computed using great-circle / ellipsoidal formulas.

For any geography column or ::geography cast, ST_Distance(a, b) returns distance in metres. Always metres, regardless of SRID.

This is slower than geometry math but accurate.

When geography is the right choice:

  • Computing actual distances in metres (proximity reports, travel distances, “how far is X from Y”)
  • Cross-region calculations where Earth’s curvature matters
  • Anywhere the answer needs to be in metres

The cast pattern

The standard tutorial pattern: store as geometry, cast to geography for distance.

1
2
SELECT ST_Distance(a.location::geography, b.location::geography) AS metres
FROM ...

This gets you geometry’s storage efficiency and indexing, plus geography’s accurate metres for distance calculations. Most production systems do this.

The catch: the geography cast disables index usage for proximity filters. ST_DWithin(a.location::geography, b.location::geography, 5000) does not use the geometry’s GIST index. If the predicate needs to be indexed, use geometry with a degree-based distance:

1
ST_DWithin(a.location, b.location, 0.05)   -- ~5km, uses index

The hybrid pattern combines both:

1
2
3
WHERE ST_DWithin(a.location, b.location, 0.05)            -- indexed filter
ORDER BY ST_Distance(a.location::geography,                -- accurate distance
                     b.location::geography)

Filter wide on geometry (uses index, fast); order precise on geography (accurate metres, runs on the small filtered set).

Spatial indexes

PostGIS uses GIST (Generalized Search Tree) indexes for spatial data. The index stores each geometry’s bounding box — the smallest rectangle that contains the geometry.

A GIST index can rapidly answer: “give me all geometries whose bounding boxes intersect this region.” Two related operators exploit this:

The && operator

a && b returns true if a’s bounding box intersects b’s bounding box. It’s index-friendly: GIST can answer it directly.

&& is only a bounding-box check. Two geometries can have intersecting bounding boxes without actually intersecting (think two narrow polygons crossing each other diagonally). For correct results you usually combine && with a precise function like ST_Intersects.

In practice you rarely write && directly. PostGIS functions like ST_DWithin, ST_Contains, and ST_Intersects use && internally as a pre-filter, then refine with the precise check.

The bounding-box trick

How ST_DWithin(a, b, X) uses the index:

  1. Expand a’s bounding box by X in each direction
  2. Use the GIST index to find candidate geometries whose bounding boxes intersect this expanded box
  3. For those candidates only, compute the actual distance and check it against X

The expensive step is step 3. The index does step 1-2 quickly, typically reducing millions of rows to a handful of candidates.

You can see this in EXPLAIN ANALYZE output:

Index Scan using index_jobs_on_location
  Index Cond: (location && st_expand(query_point, 0.05))
  Filter: st_dwithin(location, query_point, 0.05)

The Index Cond is the index doing its work. The Filter is the precise check on candidates.

Creating a spatial index

In a Rails migration:

1
add_index :jobs, :location, using: :gist

Or in raw SQL:

1
CREATE INDEX index_jobs_on_location ON jobs USING GIST (location);

The tutorial schema has GIST indexes on every geometry column.

Distance functions

ST_Distance

ST_Distance(geom_a, geom_b) returns the distance between two geometries. Units depend on the type:

  • Geometry: distance in the SRID’s native units (degrees for 4326, metres for projected CRSs)
  • Geography: distance in metres, always

Not index-friendly when used in a WHERE clause: WHERE ST_Distance(...) < X requires computing the distance for every row before comparison.

For ordering (in ORDER BY clauses), this isn’t a problem — order operations don’t use indexes the same way.

ST_DWithin

ST_DWithin(geom_a, geom_b, distance) returns boolean: are the two geometries within the given distance of each other?

Index-friendly with geometry. Not index-friendly with geography (see Geometry vs geography above).

The standard proximity filter for indexed queries.

ST_Length

ST_Length(line) returns the length of a line (or multi-line). Same unit rules as ST_Distance — degrees for geometry/4326, metres for geography.

ST_Area

ST_Area(polygon) returns the area of a polygon. Geometry returns square units of the SRID; geography returns square metres.

Topology functions

These functions ask geometric relationship questions — does X contain Y, do they overlap, do they touch? They return boolean and don’t depend on units.

ST_Contains

ST_Contains(a, b) is true if a completely contains b. The boundary doesn’t count: a point exactly on a polygon’s edge is not contained by the polygon under ST_Contains.

Most common use: “find features inside a region.”

1
2
3
4
5
SELECT * FROM customers
WHERE ST_Contains(
  (SELECT boundary FROM service_areas WHERE code = 'BNE001'),
  customers.location
);

Index-friendly via the bounding-box pre-filter.

ST_Within

ST_Within(a, b) — is a completely within b? The inverse of ST_Contains. ST_Within(a, b) is the same as ST_Contains(b, a).

ST_Intersects

ST_Intersects(a, b) is true if the two geometries share any point. Includes boundaries: two polygons that share an edge “intersect.”

More inclusive than ST_ContainsST_Contains requires complete containment; ST_Intersects accepts any shared point.

ST_Touches

ST_Touches(a, b) is true if the geometries share boundary points but not interior points. Two polygons that share an edge “touch” but don’t overlap.

Useful for finding adjacent regions.

ST_Overlaps

ST_Overlaps(a, b) — partial overlap. The geometries share interior points but neither contains the other. Two intersecting polygons where each has area outside the other.

ST_Disjoint

ST_Disjoint(a, b) — the geometries share no points at all. The opposite of ST_Intersects.

Format functions

Functions that convert geometries to and from external formats.

ST_AsGeoJSON

ST_AsGeoJSON(geom) returns the geometry as a GeoJSON string. Use ::jsonb cast to get a JSONB value directly:

1
SELECT ST_AsGeoJSON(location)::jsonb FROM customers;

The standard way to ship geometries to web clients (MapLibre, Mapbox GL JS, Leaflet — all consume GeoJSON natively).

For full FeatureCollection construction, combine with jsonb_build_object:

1
2
3
4
5
6
7
8
SELECT jsonb_build_object(
  'type', 'FeatureCollection',
  'features', jsonb_agg(jsonb_build_object(
    'type', 'Feature',
    'geometry', ST_AsGeoJSON(location)::jsonb,
    'properties', jsonb_build_object('name', name)
  ))
) FROM customers;

Returns a single JSON document. Faster than building it in Ruby for large result sets.

ST_AsText

ST_AsText(geom) returns the geometry in WKT (Well-Known Text) format:

POINT(151.2093 -33.8688)
LINESTRING(0 0, 1 1, 2 2)
POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))

Mostly useful for debugging — looking at geometries in a human-readable form. Not great for shipping data.

ST_GeomFromText

ST_GeomFromText(wkt, srid) parses WKT back into a geometry. The inverse of ST_AsText.

1
ST_GeomFromText('POINT(151.2093 -33.8688)', 4326)

Used for inserting test data, importing from text-based formats, manually constructing geometries.

ST_GeomFromGeoJSON

ST_GeomFromGeoJSON(geojson_text) parses GeoJSON into a geometry. The inverse of ST_AsGeoJSON.

Used for ingesting GeoJSON data — file imports, API payloads, etc.

Validity and repair (introduced in Module 7)

Geometric data, especially boundaries imported from external sources, can be invalid in subtle ways: self-intersections, ring ordering issues, duplicate vertices, degenerate components.

Many spatial operations behave unpredictably on invalid geometries — ST_Contains might return wrong results, ST_Area might be negative, ST_Buffer might fail.

ST_IsValid

ST_IsValid(geom) returns boolean: is the geometry valid?

1
2
3
SELECT id, code, ST_IsValid(boundary)
FROM service_areas
WHERE NOT ST_IsValid(boundary);

Useful as a data quality check after imports.

ST_IsValidReason

ST_IsValidReason(geom) returns a string describing what’s invalid, or 'Valid Geometry' if the geometry is fine. Useful for diagnosing.

ST_MakeValid

ST_MakeValid(geom) attempts to repair an invalid geometry, returning a valid version. The result might be a different geometry type than the input (e.g., a self-intersecting polygon might become a multi-polygon).

1
2
3
UPDATE service_areas
SET boundary = ST_MakeValid(boundary)
WHERE NOT ST_IsValid(boundary);

Standard fix-up step after imports.

Quick reference: function families

A mental map of the function namespace:

  • ST_Distance, ST_DWithin, ST_Length, ST_Area — measurement
  • ST_Contains, ST_Within, ST_Intersects, ST_Touches, ST_Overlaps, ST_Disjoint — topology
  • ST_AsText, ST_AsGeoJSON, ST_GeomFromText, ST_GeomFromGeoJSON — format conversion
  • ST_IsValid, ST_MakeValid, ST_IsValidReason — validity
  • ST_Buffer, ST_Union, ST_Difference, ST_Intersection, ST_ConvexHull — geometry construction (covered in Module 7)
  • ST_Transform, ST_SetSRID — coordinate system handling
  • ST_X, ST_Y, ST_NumPoints, ST_Boundary — accessing geometry properties

The full PostGIS reference is at https://postgis.net/docs/; this appendix covers the functions actively used in the tutorial.

When to use what

A mental cheat sheet for spatial work:

  • Need true distances in metres? → Cast to geography for the measurement
  • Need to filter by proximity at scale? → Use geometry with ST_DWithin and a degree-based bound
  • Need to find features inside regions?ST_Contains (or ST_Within)
  • Need to find adjacent regions?ST_Touches
  • Need to ship geometries to a web client?ST_AsGeoJSON
  • Suspicious of imported data?ST_IsValid + ST_MakeValid
  • Slow spatial query? → Check that GIST indexes exist, that the cast pattern isn’t blocking them, and that the FK alternative isn’t appropriate (see EXPLAIN ANALYZE reference)