Skip to content

Lesson 1 — What PostGIS adds

Until now your dispatch app has had PostGIS installed and enabled in its database, but we haven’t really used it. The users table from the auth generator has no location data; the chassis seeds have no spatial content. The map on the Lab page renders without ever talking to the database.

This lesson changes that. We’ll look at what PostGIS actually adds to a Postgres database, the choice of column types it offers, how spatial values are represented, and the spatial functions we’ll be using throughout the rest of the tutorial. By the end you’ll have a clear picture of what “spatial” means in Postgres and what the rest of the module is going to build on.

No Ruby code in this lesson. We’ll be in psql exploring what the database can already do.

The line between regular and spatial Postgres

Stock PostgreSQL has columns for numbers, text, booleans, dates, JSON, arrays, and a few exotic types. None of them understand geography. You can store coordinates as two floating-point columns (latitude, longitude) and write queries that compare them, but Postgres doesn’t know the values are coordinates. To Postgres they’re just numbers — it can sort them, average them, but it has no concept of “distance between two points” or “is this point inside this polygon.”

PostGIS changes that. It adds:

  • Spatial column typesgeometry and geography — for storing spatial values (points, lines, polygons, and a few more exotic shapes).
  • A library of spatial functions — about 700 of them — that operate on those types. ST_Distance, ST_Within, ST_Intersects, ST_Buffer, and so on.
  • Spatial indexes (specifically GiST and SP-GiST) that make spatial queries fast. We’ll learn how these work in Module 7; for now, know that they exist and that they’re essential for performance at scale.
  • Format conversions between binary, text, and GeoJSON representations of geometry, so spatial data flows cleanly between the database, your application, and your map.

That’s a lot, and it’s why PostGIS is its own database extension rather than a few utility functions. PostGIS is doing real geospatial work — the kind a GIS workstation does — inside your Postgres instance.

The PostGIS extension

PostGIS is a PostgreSQL extension. Extensions are a Postgres mechanism for adding functionality without modifying the core. Once installed at the system level — which you did in Module 2 by installing the postgresql-N-postgis-3 package or its equivalent — they become available to any database on that server but aren’t enabled by default. You enable an extension per database with a single SQL command:

1
CREATE EXTENSION postgis;

The dispatch deck has PostGIS enabled in its development database already, courtesy of activerecord-postgis-adapter, which automatically invokes PostGIS during db:create.

Why you might still want an explicit migration

The adapter enables PostGIS for you on db:create, but the result of that automatic invocation isn’t visible in your schema.rb. For schema-transparency reasons, many Rails projects add an explicit migration anyway:

1
2
3
4
5
class EnablePostgis < ActiveRecord::Migration[8.0]
  def change
    enable_extension "postgis"
  end
end

This is idempotent — enable_extension is essentially CREATE EXTENSION IF NOT EXISTS postgis, so it’s safe to run even if the adapter has already enabled the extension.

The chassis ships without this migration; nothing breaks if you skip it. The PostGIS project’s documentation covers extension management in detail at https://postgis.net/documentation/.

Verifying what’s there

Let’s see what PostGIS gives us. Start the development database console:

1
bin/rails dbconsole

This drops you into psql connected to your development database. From here, the simplest verification:

1
vera_dispatch_manager_development=# SELECT PostGIS_Full_Version();

The result is a verbose multi-line string with PostGIS’s version, the C library versions it’s compiled against (GEOS, PROJ, and a few others), and which optional features are available. Something like:

1
2
3
POSTGIS="3.4.2" [EXTENSION] PGSQL="160" GEOS="3.12.1-CAPI-1.18.1"
PROJ="9.3.1 NETWORK_ENABLED=OFF ..." JSON-C
LIBPROTOBUF-C="1.5.0" ...

The version itself doesn’t matter much for what we’re doing — any 3.x is fine — but seeing this banner at all confirms PostGIS is enabled in this database.

Type \q to exit when you’re done.

If you don’t see any postgis information then the extension might not have been enabled. Create the migration as mentioned above, run it then enter the dboconsole again.

Alternatively in psql (i.e. in the dbconsole) you can run

1
# CREATE EXTENSION PostGIS;

A note on SRIDs

You’ll see “SRID” mentioned throughout this lesson and the modules ahead. SRID stands for Spatial Reference IDentifier — a number that tells PostGIS which coordinate system the values in a column should be interpreted as. The number itself is just a registry ID; it doesn’t carry mathematical meaning, it points at a specification.

The most important one for our purposes is SRID 4326, which identifies the WGS84 lat/lng system. This is what GPS devices report, what GeoJSON expects, what web maps render in, and what the dispatch deck uses throughout. When you see “SRID 4326” anywhere in this tutorial, it means “ordinary lat/lng coordinates.”

Other SRIDs identify other coordinate systems — projected systems where coordinates are in metres on a flat plane, for instance. Lesson 3 unpacks the full picture. For now, “SRID 4326 is what we use” is enough to follow along.

The choice of spatial column type

PostGIS gives you three ways to organise spatial data, suited to different applications. Knowing which approach fits your app is one of the most consequential decisions you make.

Approach 1 — geometry column with SRID 4326 (lat/lng). This is what the dispatch deck uses, and what most web-based spatial apps use. Coordinates are stored as latitude and longitude in the WGS84 reference system — the same one GPS uses, GeoJSON expects, and web maps render in. Shape-comparison queries (point-in-polygon, polygon-overlap, within-region) work perfectly. For real-world distance queries that need metres, you use ST_DistanceSphere, which we’ll meet shortly.

Approach 2 — geometry column in a projected SRID. Surveying-grade work — measuring property boundaries, road centrelines, infrastructure positions to centimetre or millimetre accuracy — typically uses a projected coordinate system where the unit is metres directly. In Australia this means SRIDs like 28356 (MGA Zone 56, covering Sydney and the east coast) or 7855 (MGA2020 Zone 55, covering Melbourne). Areas come out in square metres. Distances come out in metres. The math is fast and correct within the zone the projection is designed for. The trade-off is that each projection is optimised for a specific geographic region; you can’t use one SRID for the whole country.

Approach 3 — geography column. Used when you genuinely need rigorous spheroidal math at intercontinental scale — shipping routes, global flight paths, worldwide weather. The math accounts for Earth’s curvature using the WGS84 ellipsoidal model. It’s slower than geometry calculations, but it’s correct anywhere on Earth without needing to re-project.

For the dispatch deck — Australia-spanning service-area work, job dispatch, technician positioning — Approach 1 is the right choice. Survey-grade accuracy isn’t a goal (we don’t care if a job pin is off by a few centimetres on the map), and our queries don’t span continents. We’ll use geometry with SRID 4326 throughout. Lesson 3 unpacks SRIDs in more depth; it’ll be clear by then why 4326 is the natural default.

Approach 2 is worth knowing about even though we won’t use it, because it’s the right choice for many adjacent applications — anything involving cadastral data, civil engineering, or precise field measurements.

Approach 3 is mentioned for completeness. If you’re ever working on a global-scale problem, you’ll know to reach for geography.

What spatial values actually look like

Let’s create a small geometry value in psql and see what’s stored. Open a database console:

1
bin/rails dbconsole

PostGIS gives us a few constructor functions; the simplest is ST_MakePoint:

1
vera_dispatch_manager_development=# SELECT ST_MakePoint(151.2093, -33.8688);

The result:

1
2
3
            st_makepoint
--------------------------------------------
 0101000000B81E85EB51385F40D7A3703D0AD740C0

That hex string is the WKB (Well-Known Binary) encoding — PostGIS’s internal binary format for geometry values. It’s compact and machine-readable, but unreadable to humans.

PostGIS provides functions to convert geometry to formats that are more useful for inspection. ST_AsText gives you WKT (Well-Known Text):

1
vera_dispatch_manager_development=# SELECT ST_AsText(ST_MakePoint(151.2093, -33.8688));
1
2
3
        st_astext
-------------------------
 POINT(151.2093 -33.8688)

POINT(x y) is WKT’s syntax for a point. Notice the order: longitude first, latitude second. WKT is mathematics-convention, like GeoJSON. (We met this convention in Module 2 Lesson 4.)

ST_AsGeoJSON gives you GeoJSON:

1
vera_dispatch_manager_development=# SELECT ST_AsGeoJSON(ST_MakePoint(151.2093, -33.8688));
1
2
3
                    st_asgeojson
----------------------------------------------------
 {"type":"Point","coordinates":[151.2093,-33.8688]}

Same point, different format. WKT is good for human-readable inspection in psql. GeoJSON is what your application emits when serving spatial data to a browser. WKB is what’s actually stored in the column.

Activity 1 — Inspect a polygon

In psql, build a small polygon (a triangle) and inspect it:

1
2
3
4
5
SELECT ST_AsText(
  ST_MakePolygon(
    ST_GeomFromText('LINESTRING(0 0, 1 0, 0 1, 0 0)')
  )
);

You should see POLYGON((0 0,1 0,0 1,0 0)). A polygon’s WKT uses doubled parentheses, and a triangle has four points — the first and last are the same to close the shape. (In GIS, these shapes are called “rings.”)

Compute its area:

1
2
3
4
5
SELECT ST_Area(
  ST_MakePolygon(
    ST_GeomFromText('LINESTRING(0 0, 1 0, 0 1, 0 0)')
  )
);

You should see 0.5 — the area of a triangle with legs of 1 in whatever coordinate units we’re using. (Without an SRID, PostGIS just treats the numbers as Cartesian. Real-world geometry with an SRID would return area in square units of that SRID.)

This is the simplest kind of spatial computation: a function that takes geometry and returns a number. The library has many of these; we’ll see them as they come up.

Distance queries

The dispatch deck will eventually need to answer questions like “which technician is closest to this new job?” That’s a distance query, and it needs to return real-world metres.

The natural function to reach for is ST_Distance. But on a geometry column with SRID 4326, ST_Distance returns Cartesian distance in degrees — not metres. Degrees of latitude and longitude don’t measure the same thing on the ground (a degree of longitude shrinks toward the poles), so the result has no useful physical meaning.

For real-world distances, PostGIS provides ST_DistanceSphere. It works directly on geometry columns with lat/lng coordinates and returns the answer in metres, treating the Earth as a sphere:

1
2
3
4
SELECT ST_DistanceSphere(
  ST_GeomFromText('POINT(151.2093 -33.8688)', 4326),
  ST_GeomFromText('POINT(153.0251 -27.4698)', 4326)
);

The result is around 732,000 — that’s metres, so 732 km. The distance from Sydney to Brisbane.

ST_DistanceSphere uses a spherical Earth model, which is slightly less accurate than the more rigorous ST_DistanceSpheroid (which uses the WGS84 ellipsoidal model). For typical dispatch-distance queries, the difference is meaningless — over 1000 km the spherical answer is wrong by a few hundred metres at most, which doesn’t change “which technician is closest.” We use ST_DistanceSphere throughout the dispatch deck.

This is the pattern we’ll see again in Module 9 when the dispatcher’s view computes “find the closest technician to this new job.” A query like:

1
2
3
4
SELECT id, name, ST_DistanceSphere(location, 'POINT(151.2 -33.9)') AS metres
FROM technicians
ORDER BY metres ASC
LIMIT 5;

returns the five technicians nearest a given point, with their distances in metres. The whole query operates on geometry columns, no casting, no projection juggling.

Activity 2 — Compute real distances

Try a few cross-Australia distances using ST_DistanceSphere.

Sydney to Melbourne:

1
2
3
4
SELECT ST_DistanceSphere(
  ST_GeomFromText('POINT(151.2093 -33.8688)', 4326),
  ST_GeomFromText('POINT(144.9631 -37.8136)', 4326)
);

You should see something close to 713,000 — about 713 km.

Sydney to Perth:

1
2
3
4
SELECT ST_DistanceSphere(
  ST_GeomFromText('POINT(151.2093 -33.8688)', 4326),
  ST_GeomFromText('POINT(115.8575 -31.9523)', 4326)
);

Around 3,290,000 — 3,290 km, the genuinely long cross-Australia run from east coast to west coast. You can verify both numbers against any online map’s “directions” tool.

Notice that you didn’t have to specify units, perform any casts, or project the points. ST_DistanceSphere takes geometry values with SRID 4326 directly and returns metres. This is the cleanest pattern for the dispatch deck’s distance queries.

Type \q to exit psql when you’re done.

The spatial function library

ST_MakePoint, ST_AsText, ST_Area, ST_DistanceSphere — PostGIS adds about 700 functions, all prefixed with ST_ (for “Spatial Type,” after the SQL/MM standard). They cluster into rough categories:

  • Constructors that build geometry values from coordinates, text, or other formats: ST_MakePoint, ST_GeomFromText, ST_GeomFromGeoJSON.
  • Format converters that turn geometry into other representations: ST_AsText, ST_AsGeoJSON, ST_AsBinary.
  • Measurements that compute scalar values from geometry: ST_Distance, ST_DistanceSphere, ST_Length, ST_Area, ST_Perimeter.
  • Predicates that test relationships between geometries and return booleans: ST_Within, ST_Intersects, ST_Contains, ST_DWithin (within a given distance).
  • Constructive operations that build new geometry from existing geometry: ST_Buffer, ST_Union, ST_Intersection, ST_Centroid.
  • Accessors for inspecting geometry: ST_X, ST_Y, ST_GeometryType, ST_NPoints, ST_SRID.

We’ll meet many of these across the rest of the tutorial. You don’t need to memorise the catalog. The PostGIS reference at https://postgis.net/docs/reference.html is comprehensive, well organised, and what you’ll reach for when you need a function you haven’t met yet.

Spatial indexes

One last piece worth mentioning here, even though we won’t use it until Module 7: spatial indexes.

A regular B-tree index — the default Postgres index type — is perfect for indexing scalar values that have a natural order: integers, dates, strings. It’s useless for indexing geometries. “Is POINT(151, -33) less than POINT(0, 0)?” doesn’t have a meaningful answer.

PostGIS uses GiST indexes (Generalised Search Tree) for spatial data. A GiST index on a geometry column doesn’t store the full geometries; it stores their bounding boxes. Any spatial query that can be reduced to a bounding-box check benefits hugely from the index — usually 100× to 1000× faster than scanning every row.

For now, just know that the right answer to “my spatial query is slow” is almost always “add a GiST index.” We’ll make this concrete in Module 7. By then we’ll have a half-million job records and the difference will be very visible.

Where this leaves us

You now know:

  • PostgreSQL on its own can store coordinates but doesn’t understand them spatially. PostGIS is the extension that adds spatial awareness.
  • PostGIS adds spatial column types (geometry and geography), about 700 spatial functions, and spatial indexes (GiST).
  • Three approaches to spatial column choice exist; the dispatch deck uses geometry with SRID 4326 because it fits web-based regional dispatch work.
  • Geometry values are stored as binary (WKB) but can be converted to WKT for inspection or GeoJSON for transmission.
  • Coordinates in WKT/GeoJSON/MapLibre/PostGIS are longitude-first.
  • ST_DistanceSphere returns real-world metres directly from geometry columns — the natural function for distance queries the dispatch deck will run.

In the next lesson we’ll add a Depot model with a geometry column to the dispatch app, insert one depot — Sydney’s warehouse — via SQL, and query it back. The first row of real spatial data in the database.