Ruby Field Notes

Lesson 3 — FO travel analysis

Lessons 1 and 2 covered distance and proximity in isolation — how far is X from Y, what’s within N metres. This lesson asks a more substantial analytical question: how far does each field officer travel between consecutive jobs?

The question is operationally meaningful. Travel time is unpaid downtime; an FO driving from Penrith to Bondi between jobs is unproductive time the dispatch deck pays for. Aggregate travel distance per FO is a real KPI in field service operations — efficient dispatch minimises it, inefficient dispatch grows it.

The query is interesting because it requires a SQL pattern most Rails developers don’t reach for often: window functions. Specifically LAG(), which lets each row see its predecessor within an ordered partition. Combined with the spatial distance work from Lesson 1, this produces real travel distances per FO.

The lesson covers:

Window functions, focusing on LAG() for accessing previous-row values
Combining LAG() with ST_Distance for spatial difference per row
A complete travel-summary query staged with a CTE
A TravelReport page in the manager’s report section, using the existing Components::DataTable in a new mode

By the end you’ll have a new report on the dashboard, a SQL pattern that’s useful far beyond field service work, and a sharper sense of when to use AR-backed table tooling versus when to treat service-shaped data as its own thing.

The shape of the question

Each FO completes jobs in some sequence ordered by time. Consecutive jobs have locations. The distance between two consecutive locations is the travel distance for that transition.

For a single FO over a small set of jobs:

Job 1: Bondi          (started 09:15)
Job 2: Surry Hills    (started 11:30)  ← travel from Bondi
Job 3: Newtown        (started 13:45)  ← travel from Surry Hills
Job 4: Glebe          (started 15:30)  ← travel from Newtown

Three travel transitions across four jobs. The first job has no predecessor, so no travel distance applies to it.

Per-FO total travel is the sum of these transition distances over a time window.

Window functions, briefly

A window function operates on a set of rows related to the current row, producing a value per row without collapsing the result set (unlike GROUP BY, which produces one row per group).

The general syntax:

1
2
3
4


function_name() OVER (
  PARTITION BY ...
  ORDER BY ...
)

PARTITION BY divides rows into groups (like GROUP BY but without aggregating)
ORDER BY orders rows within each partition
function_name() operates over the ordered partition

LAG() returns a column’s value from a previous row in the ordered partition:

1

LAG(column_name) OVER (PARTITION BY ... ORDER BY ...)

For the first row in each partition, there’s no previous row, so LAG returns NULL.

This lesson uses LAG, but window functions cover a wider repertoire — LEAD, RANK, ROW_NUMBER, FIRST_VALUE, LAST_VALUE, plus aggregate functions used as windows. See Appendix A — Window functions for the broader treatment.

Building the query

Step by step. First, get all completed jobs in the window for one FO, ordered by start time:

1
2
3
4
5


SELECT j.id, j.started_at, j.location
FROM jobs j
WHERE j.assigned_to_id = 1
  AND j.completed_at >= NOW() - INTERVAL '90 days'
ORDER BY j.started_at;

Substitute an FO id you have. The result is the sequence we want to analyse.

Now add LAG() to give each row visibility of the previous location:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11


SELECT j.id,
       j.started_at,
       j.location AS current_location,
       LAG(j.location) OVER (
         PARTITION BY j.assigned_to_id
         ORDER BY j.started_at
       ) AS previous_location
FROM jobs j
WHERE j.assigned_to_id = 1
  AND j.completed_at >= NOW() - INTERVAL '90 days'
ORDER BY j.started_at;

Each row now has both its own location and the previous job’s location. The first row’s previous_location is NULL.

The PARTITION BY j.assigned_to_id is technically redundant when we filter to one FO with WHERE j.assigned_to_id = 1 — but keeping it ensures the window function works correctly when we later run the query without the FO filter (across all FOs at once). Window functions partition naturally; the clause is the right shape.

Now add the distance:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13


SELECT j.id,
       j.started_at,
       ST_Distance(
         j.location::geography,
         LAG(j.location) OVER (
           PARTITION BY j.assigned_to_id
           ORDER BY j.started_at
         )::geography
       ) AS distance_m
FROM jobs j
WHERE j.assigned_to_id = 1
  AND j.completed_at >= NOW() - INTERVAL '90 days'
ORDER BY j.started_at;

Each row now has the distance from the previous job to this job, in metres. The first job’s distance is NULL (because LAG returned NULL).

Note the ::geography casts — this is a measurement query, not a filter, so the cast pattern from Lesson 1 applies. We want true metres, not Cartesian degrees.

Sample output for one FO over 90 days:

  id  |    started_at       | distance_m
------+---------------------+-------------
  142 | 2026-02-04 09:12:00 |       NULL
  287 | 2026-02-04 11:43:00 |     8124.3
  391 | 2026-02-04 14:21:00 |    12847.6
  482 | 2026-02-05 08:34:00 |    18293.7
  ...

The first job has no travel distance (no predecessor). Each subsequent job shows how far the FO travelled to get there.

Aggregating per FO

To produce a per-FO summary, we need the total travel distance over the window. The natural shape: a CTE that computes per-job distances using the window function, then an aggregation query over that result.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24


WITH travel AS (
  SELECT j.assigned_to_id,
         ST_Distance(
           j.location::geography,
           LAG(j.location) OVER (
             PARTITION BY j.assigned_to_id
             ORDER BY j.started_at
           )::geography
         ) AS distance_m
  FROM jobs j
  WHERE j.assigned_to_id IS NOT NULL
    AND j.completed_at >= NOW() - INTERVAL '90 days'
)
SELECT u.id,
       u.name,
       COUNT(t.distance_m) AS transitions,
       ROUND(SUM(t.distance_m) / 1000)::int AS total_km,
       ROUND(AVG(t.distance_m) / 1000)::int AS avg_km
FROM travel t
JOIN users u ON u.id = t.assigned_to_id
WHERE t.distance_m IS NOT NULL
GROUP BY u.id, u.name
ORDER BY total_km DESC
LIMIT 20;

Output:

 id  |       name        | transitions | total_km | avg_km
-----+-------------------+-------------+----------+--------
  47 | Asher Adams       |         182 |     5247 |     29
  23 | Lily Lawson       |         174 |     4968 |     29
 141 | Marcus Murphy     |         168 |     4523 |     27
 ...

A few things worth understanding about this query.

The CTE separates concerns. The travel CTE produces one row per job with the per-job transition distance. The outer query aggregates that into per-FO totals. Doing it as a single query with both the window function and the aggregation in the same SELECT is awkward — window functions operate on the un-aggregated row set, while GROUP BY collapses rows. The CTE pattern keeps each step clean. See Appendix A — Common Table Expressions for more on when CTEs are necessary, not just nice.

COUNT(t.distance_m) ignores NULLs. Each FO has one “first job” with NULL distance (their entry into the window). COUNT(t.distance_m) skips those automatically; the result is the number of transitions — moves between jobs — not the total job count.

The WHERE t.distance_m IS NOT NULL in the outer query is technically redundant given that COUNT, SUM, and AVG all skip NULLs. We include it for clarity — it makes explicit that we’re looking at transitions, not jobs.

ROUND(... / 1000)::int converts metres to integer kilometres for a clean display.

Service object: producing the data

The query goes into a service object. Two design points worth flagging before the code: the service decides what sort means for this data, and the service returns plain hashes (not ActiveRecord objects). Both matter for the reporting page below.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74


module FoTravel
  extend self

  WINDOW_DAYS    = 90
  SORT_COLUMNS   = %w[name transitions total_km avg_km].freeze
  DEFAULT_SORT   = "total_km".freeze
  DEFAULT_DIR    = "desc".freeze

  # Returns per-FO travel summary over the last 90 days. Each entry
  # gives the FO's id, name, transition count, total km travelled
  # between consecutive jobs, and average km per transition.
  #
  # Sort behaviour is part of the service's contract — the caller
  # passes sort/dir from the request, the service whitelists and
  # applies them.
  def call(sort: nil, dir: nil)
    rows = fetch_rows
    apply_sort(rows, sort, dir)
  end

  private

  def fetch_rows
    rows = ActiveRecord::Base.connection.select_all(<<~SQL).rows
      WITH travel AS (
        SELECT j.assigned_to_id,
               ST_Distance(
                 j.location::geography,
                 LAG(j.location) OVER (
                   PARTITION BY j.assigned_to_id
                   ORDER BY j.started_at
                 )::geography
               ) AS distance_m
        FROM jobs j
        WHERE j.assigned_to_id IS NOT NULL
          AND j.completed_at >= NOW() - INTERVAL '#{WINDOW_DAYS} days'
      )
      SELECT u.id,
             u.name,
             COUNT(t.distance_m) AS transitions,
             ROUND(SUM(t.distance_m) / 1000)::int AS total_km,
             ROUND(AVG(t.distance_m) / 1000)::int AS avg_km
      FROM travel t
      JOIN users u ON u.id = t.assigned_to_id
      WHERE t.distance_m IS NOT NULL
      GROUP BY u.id, u.name
    SQL

    rows.map { |id, name, transitions, total_km, avg_km|
      {
        id:          id,
        name:        name,
        transitions: transitions.to_i,
        total_km:    total_km.to_i,
        avg_km:      avg_km.to_i
      }
    }
  end

  # Whitelist sort column to known keys; default if invalid.
  # Whitelist direction to asc/desc.
  #
  # The whitelist is essential: without it, untrusted user input could
  # influence what we sort by. With it, params from the URL flow
  # safely into the sort logic.
  def apply_sort(rows, sort, dir)
    sort_key = SORT_COLUMNS.include?(sort) ? sort.to_sym : DEFAULT_SORT.to_sym
    direction = dir == "asc" ? :asc : :desc

    sorted = rows.sort_by { |row| row[sort_key] }
    sorted.reverse! if direction == :desc
    sorted
  end
end

A few things worth noting.

Whitelisted sort column. Only name, transitions, total_km, or avg_km can be sorted on. Anything else falls back to the default. This is a small but important security and reliability practice — sort columns come from URLs, which means they’re user input.

Sort happens on the result, in Ruby. With ~1500 FOs at large scale, this is a few milliseconds. For larger datasets, you’d push the sort into the SQL query (using interpolated whitelisted column names in the ORDER BY clause).

#{WINDOW_DAYS} is interpolated, not parameterised. The constant is a fixed integer — no injection risk. We could parameterise it, but for hardcoded literals interpolation is cleaner.

Returns plain hashes. The service has done analytical work to produce per-FO summaries. The result isn’t a User record (those exist), nor is it any single AR model — it’s the output of a calculation. Returning hashes is honest about that.

Run it from console:

1
2
3
4
5
6


FoTravel.call.first(3)
# => [
#   { id: 47,  name: "Asher Adams",   transitions: 182, total_km: 5247, avg_km: 29 },
#   { id: 23,  name: "Lily Lawson",   transitions: 174, total_km: 4968, avg_km: 29 },
#   { id: 141, name: "Marcus Murphy", transitions: 168, total_km: 4523, avg_km: 27 }
# ]

A small architectural distinction

This is a useful moment to think about where DataTable — the table component you’ve already used in earlier modules for AR collections — fits.

Up to now, DataTable has rendered ActiveRecord collections backed by Ransack for search/sort and Pagy for pagination. The pattern for those tables looks like:

1
2
3
4
5
6
7
8


@q = Job.visible_to(Current.user).ransack(params[:q])
@pagy, @jobs = pagy(@q.result, items: 25)

render Components::DataTable.new(rows: @jobs, pagy: @pagy, q: @q) do |t|
  t.search field: :reference_or_customer_name_cont
  t.column :reference, sort: :reference
  ...
end

Ransack works well when sort columns map to model attributes and search matchers can be expressed as predicates on those attributes. For a list of jobs, customers, orders — anywhere the rows are fundamentally an AR model’s records with simple where/joins — that’s the right shape.

But this lesson’s data isn’t that. The rows are computed per-FO summaries from a window-function query. Sort doesn’t mean “sort the underlying users table by some column”; it means “sort the calculated output.” Ransack would be the wrong tool — there isn’t really a Ransack-compatible underlying relation to attach search/sort to.

So we extend DataTable to handle a second case. When the data is already shaped by a service object, the service object becomes responsible for what sort/filter mean, and DataTable just renders the result. The component’s contract becomes:

Pass q: (a Ransack search object) → AR mode. Component uses Ransack for search/sort, expects AR records as rows.
Don’t pass q: → Hash mode. Component expects an array of hashes; sort headers become plain ?sort=...&dir=... links; no search bar.

Same DSL in both modes. The data shape determines the behaviour.

(For full reference on DataTable’s two modes, see Appendix D — DataTable Component.)

A new report page

Module 7’s first new report. The manager already has a Reports section in the sidebar from Module 6; we add one more entry.

In Components::Sidebar’s REPORTS constant:

1
2
3
4
5
6
7


REPORTS = [
  { label: "Choropleth Report",   path: "/reports/choropleth",          icon: :chart_bar,        roles: %w[manager] },
  { label: "SLA Performance",     path: "/reports/sla_performance",     icon: :clock,            roles: %w[manager] },
  { label: "Service Area Detail", path: "/reports/service_area_detail", icon: :magnifying_glass, roles: %w[manager] },
  { label: "Job Density",         path: "/reports/job_density",         icon: :fire,             roles: %w[manager] },
  { label: "FO Travel",           path: "/reports/fo_travel",           icon: :truck,            roles: %w[manager] }
].freeze

:truck for the FO travel idiom — wheels, motion, distance. Substitute if your icon component doesn’t have it; :globe_alt or :map would also work.

In ReportsController:

1
2
3
4
5
6
7


def fo_travel
  rows = FoTravel.call(
    sort: params[:sort],
    dir:  params[:dir]
  )
  @pagy, @rows = pagy_array(rows, items: 25)
end

pagy_array is Pagy’s helper for pre-loaded arrays. Same interface as pagy, just slices the array instead of paginating an AR relation.

Route in config/routes.rb:

1

get "/reports/fo_travel", to: "reports#fo_travel", as: :fo_travel_report

The view at app/views/reports/fo_travel.html.erb:

<div class="h-full flex flex-col">
  <header class="px-6 py-4 border-b border-slate-200">
    <h1 class="text-2xl font-bold text-slate-900">FO Travel</h1>
    <p class="text-sm text-slate-500 mt-1">
      Distance travelled between consecutive jobs by each field officer
      over the last 90 days. Sortable by any column.
    </p>
  </header>

  <div class="flex-1 min-h-0 p-6 overflow-auto">
    <%= render Components::DataTable.new(rows: @rows, pagy: @pagy) do |t| %>
      <% t.column :name,        label: "Field Officer", sort: :name %>
      <% t.column :transitions, label: "Transitions",   sort: :transitions, align: :right %>
      <% t.column :total_km,    label: "Total km",      sort: :total_km,    align: :right %>
      <% t.column :avg_km,      label: "Avg km",        sort: :avg_km,      align: :right %>
    <% end %>
  </div>
</div>

Notice what’s not there: no q:, no t.search. That’s the signal to DataTable that this is hash-shaped data, and it switches to its plain-server-side mode automatically.

Updating DataTable

The component needs a small extension to handle hash-mode rows. The core changes:

Read column values from hashes (row[:key]) when the row is a Hash, otherwise from AR (row.public_send(:column)).
Render sort headers as plain ?sort=...&dir=... links when q: is nil; otherwise use Ransack’s sort_link.
Skip the search bar when q: is nil (search needs Ransack).

The full updated component is in app/components/data_table.rb. The key new method is the value-reading helper:

1
2
3
4
5
6
7
8
9


# Extract a column's value from a row. Handles both AR and Hash
# rows transparently.
def value_for(row, name)
  if row.is_a?(Hash)
    row[name] || row[name.to_s]
  else
    row.public_send(name)
  end
end

And the sort link rendering branches on @q.nil?:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14


def render_th(col)
  align_class = col[:align] == :right ? "text-right" : "text-left"
  th(class: "px-4 py-2.5 font-medium #{align_class}", scope: "col") do
    if col[:sort] && @q
      # AR mode: Ransack's sort_link
      plain sort_link(@q, col[:sort], col[:label], ...)
    elsif col[:sort]
      # Hash mode: plain sort link
      render_plain_sort_link(col)
    else
      plain col[:label]
    end
  end
end

The plain-mode sort link generates ?sort=col&dir=direction URLs, toggling direction on repeat clicks of the same column, preserving other request params.

(See Appendix D — DataTable Component for the full component source and usage notes.)

Look what just happened

A table appears showing every FO who completed at least two jobs in the last 90 days. Default sort is by total km descending — the busiest-travelling FOs first. Click the “Total km” header to reverse the sort. Click “Name” to sort alphabetically. The page navigates, the data re-renders with the new sort applied.

Pagination shows 25 FOs per page. The page links along the bottom let you walk through the full set.

The numbers tell a story. Some FOs travel a lot (3000-5000 km over 90 days, busy work in metro areas). Some travel less (quiet rural depots, fewer transitions). The “average per transition” column reveals different operational patterns — a high average suggests jobs spread across geography; a low average suggests jobs clustering close together.

For a real dispatch operation, this report would be a weekly check-in. Trends over time matter — if the average travel per transition is climbing, dispatch is becoming less efficient and that’s a flag worth investigating.

Where this leaves us

The dispatch deck has a fourth report, and the toolbox has a new SQL pattern alongside a sharper sense of when service-shaped data deserves its own rendering path.

Indexed proximity Coverage and territory