The previous post built an adaptive frontier: 32,740 non-overlapping H3 cells, each holding at most 10,000 places, covering 72.8M places globally. Those cells are work units ready to queue.

Now the question: how do you split those cells across N parallel workers without overlap?

The simplest possible scheme:

worker_id = h3_cell_index % N

Worker i processes every cell where cell::bigint % N = i. No coordination, no distributed lock, no queue churn. Each worker is fully independent and restartable.
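A minimal sketch of that assignment in Python. The integers below are made-up stand-ins rather than real H3 indexes, and the assign helper is a name introduced here for illustration. It shows that the modulo alone partitions the work: every cell lands on exactly one worker.

```python
from collections import defaultdict


def assign(cells, n_workers):
    """Group cell indexes by worker_id = cell % n_workers."""
    shards = defaultdict(list)
    for cell in cells:
        shards[cell % n_workers].append(cell)
    return dict(shards)


# Hypothetical cell indexes, not real H3 values.
cells = [11, 25, 39, 47, 58, 63, 70]
shards = assign(cells, 3)

# Flatten the shards again: each cell should appear exactly once.
assigned = sorted(c for group in shards.values() for c in group)
print(shards)
print(assigned == sorted(cells))
```

Because the worker ID is a pure function of the cell index, two workers never need to talk to each other to agree on who owns a cell.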

But does N matter? The initial instinct was to reach for primes — they're the canonical "safe" choice for hash distributions. The benchmark confirmed something, but not what I expected: the real distinction isn't prime vs composite at all.

Setup

We use places_h3_t10000, the adaptive frontier at threshold 10,000: 32,740 cells at resolutions 1–9, covering 72,783,221 places. Worker assignment in Postgres:

cell::bigint % {workers_num} AS worker_id

H3 cell indexes have bit 63 = 0 (reserved, always zero for valid cells), so the bigint is non-negative. Worker IDs are always in [0, N).

We tested 14 values of N: primes and composites, small and large, even and odd:

primes:     2, 3, 5, 7, 13, 29
composites: 4, 6, 8, 15, 16, 21, 25, 35

The odd composites (15=3×5, 21=3×7, 25=5², 35=5×7) are the key addition — they let us separate "composite" from "even" as the actual failure condition.

The validation check for each N: do all IDs from 0 to N-1 get at least one cell?

WITH per_worker AS (
  SELECT cell::bigint % {workers_num} AS worker_id
  FROM places_h3_t10000
  GROUP BY worker_id
),
expected AS (
  SELECT generate_series(0, {workers_num} - 1) AS worker_id
)
SELECT
  count(e.worker_id)                                                             AS expected_workers,
  count(p.worker_id)                                                             AS active_workers,
  count(e.worker_id) - count(p.worker_id)                                        AS idle_workers,
  CASE WHEN count(p.worker_id) = count(e.worker_id) THEN 'PASS' ELSE 'FAIL' END  AS check_result,
  array_agg(e.worker_id ORDER BY e.worker_id) FILTER (WHERE p.worker_id IS NULL)  AS idle_ids
FROM expected e
LEFT JOIN per_worker p ON e.worker_id = p.worker_id

Results

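 N  | type      | active | idle | check | cv_places | imbalance_ratio
----+-----------+--------+------+-------+-----------+-----------------
  2 | prime     |      1 |    1 | FAIL  |           |
  3 | prime     |      3 |    0 | PASS  |    0.0083 |           1.006
  4 | composite |      1 |    3 | FAIL  |           |
  5 | prime     |      5 |    0 | PASS  |    0.0140 |           1.015
  6 | composite |      3 |    3 | FAIL  |           |
  7 | prime     |      7 |    0 | PASS  |    0.0105 |           1.011
  8 | composite |      1 |    7 | FAIL  |           |
 13 | prime     |     13 |    0 | PASS  |    0.0264 |           1.037
 15 | composite |     15 |    0 | PASS  |    0.0314 |           1.048
 16 | composite |      1 |   15 | FAIL  |           |
 21 | composite |     21 |    0 | PASS  |    0.0195 |           1.030
 25 | composite |     25 |    0 | PASS  |    0.0348 |           1.065
 29 | prime     |     29 |    0 | PASS  |    0.0316 |           1.072
 35 | composite |     35 |    0 | PASS  |    0.0418 |           1.072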

cv_places is stddev/mean of the places per worker. imbalance_ratio is max/mean (how much more load the busiest worker carries than the average). Balance metrics are only meaningful for passing configs, so they are left blank for failing ones.

Two things jump out: N=2 is prime and fails. N=15, 21, 25, 35 are composites and pass with distribution indistinguishable from primes of the same size.

Figure: Los Angeles area split across N=3 workers. Each color is one worker's slice, non-overlapping by construction: the H3 index modulo does all the work, with no coordination. The dense urban core subdivides into finer hexes; sparse outskirts stay coarse.

Per-worker detail: pass vs fail

N=7 (prime, PASS) — all 7 workers receive cells, tightly balanced:

 worker_id | cells |   places
-----------+-------+------------
         0 | 4,636 | 10,516,175
         1 | 4,695 | 10,326,536
         2 | 4,673 | 10,235,796
         3 | 4,696 | 10,503,856
         4 | 4,633 | 10,418,204
         5 | 4,723 | 10,477,655
         6 | 4,684 | 10,304,999
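As a worked example, the two balance metrics can be recomputed from the N=7 place counts above. This is a sketch rather than the benchmark code: it assumes cv_places uses the sample standard deviation (what Postgres stddev returns), which reproduces the values reported for N=7.

```python
import statistics

# Places per worker for N=7, copied from the per-worker table.
places = [10_516_175, 10_326_536, 10_235_796, 10_503_856,
          10_418_204, 10_477_655, 10_304_999]

mean = statistics.mean(places)
cv_places = statistics.stdev(places) / mean   # sample stddev / mean
imbalance_ratio = max(places) / mean          # busiest worker vs average

print(f"cv_places={cv_places:.4f} imbalance_ratio={imbalance_ratio:.3f}")
# cv_places=0.0105 imbalance_ratio=1.011
```

An imbalance_ratio of 1.011 means the busiest of the seven workers handles about 1.1% more places than the average worker.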

N=8 (composite, FAIL) — workers 0–6 idle; worker 7 handles everything:

 worker_id |  cells |   places
-----------+--------+------------
         0 |      0 |          0
         … |      0 |          0
         7 | 32,740 | 72,783,221

N=6 (composite, FAIL) — only odd IDs receive work:

 worker_id |  cells |   places
-----------+--------+------------
         0 |      0 |          0
         1 | 10,972 | 24,338,838
         2 |      0 |          0
         3 | 10,920 | 24,031,744
         4 |      0 |          0
         5 | 10,848 | 24,412,639

N=21 = 3×7 (composite, PASS) — all 21 workers active, even spread:

 worker_id |  cells |   places
-----------+--------+------------
         0 |  1,521 |  3,476,285
         1 |  1,553 |  3,424,652
         2 |  1,540 |  3,479,906
         … | ~1,560 | ~3,470,000
        20 |  1,555 |  3,436,357

N=21 (3×7) distributes as cleanly as any prime of the same size. The composite structure doesn't hurt it at all.

Why even N fails

H3 cells are 64-bit integers. The format packs a resolution field and a chain of base-7 digits — each resolution level contributes one 3-bit digit (values 0–6). For a cell at resolution r, positions r+1..15 are filled with the invalid digit sentinel: 7 = 111 in binary.

Our adaptive view has cells at resolutions 1–9, so every cell has at least 6 unused digit positions (levels 10–15), each contributing 3 bits of all-ones — at least 18 bits locked to 1 (more for coarser cells; 18 is the minimum, shared by all). (The sentinel value itself is documented in the H3 spec; the consequence for modulo sharding is not.)
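To make the layout concrete, here is a small sketch that decodes those fields with plain bit arithmetic. The sample index 0x8928308280fffff is an assumption for illustration (a resolution-9 index of the kind used in H3 documentation examples), not a cell taken from the view, and the decode_h3 helper is likewise hypothetical.

```python
def decode_h3(index):
    """Split a 64-bit H3 cell index into its resolution and 15 digit slots."""
    resolution = (index >> 52) & 0xF  # 4-bit resolution field
    # Digit slot 1 sits in bits 42..44, slot 15 in bits 0..2 (3 bits each).
    digits = [(index >> (3 * (15 - pos))) & 0b111 for pos in range(1, 16)]
    return resolution, digits


sample = 0x8928308280FFFFF  # assumed example index, resolution 9
resolution, digits = decode_h3(sample)

used, unused = digits[:resolution], digits[resolution:]
print(resolution)  # resolution stored in the index
print(used)        # real digits, each in 0..6
print(unused)      # sentinel 7 for every level past the resolution

# Six unused slots of 0b111 make the low 18 bits all ones.
low18 = sample & ((1 << 18) - 1)
print(low18 == (1 << 18) - 1)
```

The same decoding applied to a coarser cell would show more than six sentinel slots, which is why 18 fixed bits is the minimum across the view rather than the exact count for every cell.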

SELECT cell::bigint & ((1::bigint << 18) - 1) AS low18_bits, count(*)
FROM places_h3_t10000
GROUP BY 1;
 low18_bits | count
------------+-------
     262143 | 32740

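All 32,740 cells have identical low 18 bits: 262143 = 2^18 − 1. Every cell in the view is therefore an odd number, as is any H3 cell below resolution 15, because its lowest digit slot holds the all-ones sentinel.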

This is why even N fails, but the failure goes deeper than parity. For N = 2^k (a power of two, up to 2^18 for this view), cell % N reads only the low k bits, which are all fixed to 1. So every cell maps to exactly one worker ID: 262143 % N = N − 1. For N=2 → worker 1. For N=8 → worker 7. For N=16 → worker 15. Workers 0–6 are idle for N=8 not because they're even-numbered, but because every cell lands on residue 7.

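The general rule follows from gcd. Write each cell as 2^18 × q + 262143, where q holds the variable high bits (bits 18 and above). For N = 2^k × m (m odd, k ≤ 18), gcd(2^18, N) = 2^k, so 2^18 × q mod N is always a multiple of 2^k and can take only N / 2^k = m distinct values. Adding the constant 262143 shifts those residues but cannot create new ones. Effective workers = odd part of N = m.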

 N           | odd part | active
-------------+----------+--------
  4 = 4×1    |        1 |      1
  6 = 2×3    |        3 |      3
  8 = 8×1    |        1 |      1
 16 = 16×1   |        1 |      1
 15 = 1×15   |       15 |     15
 21 = 1×21   |       21 |     21
 35 = 1×35   |       35 |     35

Odd composites have odd part = themselves, so they fully activate all workers — same as primes.
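The rule can be checked without a database. The sketch below builds synthetic indexes whose low 18 bits are all ones and counts how many worker IDs actually receive work for each tested N. The indexes are stand-ins for real cells: using a plain counter for the high bits is an assumption made for illustration, and the helper names are introduced here.

```python
LOW_BITS = 18
SENTINEL = (1 << LOW_BITS) - 1  # 262143: low 18 bits all ones


def odd_part(n):
    """Strip every factor of two from n."""
    while n % 2 == 0:
        n //= 2
    return n


def active_workers(n, samples=1000):
    """Count distinct residues mod n over synthetic sentinel-tailed indexes."""
    cells = ((q << LOW_BITS) | SENTINEL for q in range(samples))
    return len({cell % n for cell in cells})


# Same 14 values of N as the benchmark.
for n in (2, 3, 4, 5, 6, 7, 8, 13, 15, 16, 21, 25, 29, 35):
    print(n, odd_part(n), active_workers(n))
```

For every N in the list the last two columns agree, matching the active counts in the results table.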

Balance: composites match primes

For passing configs at comparable sizes:

 N  | type            | imbalance_ratio | cv_places
----+-----------------+-----------------+-----------
 13 | prime           |           1.037 |    0.0264
 15 | composite (3×5) |           1.048 |    0.0314
 21 | composite (3×7) |           1.030 |    0.0195
 25 | composite (5²)  |           1.065 |    0.0348
 29 | prime           |           1.072 |    0.0316
 35 | composite (5×7) |           1.072 |    0.0418

N=21 is actually more balanced than N=29, despite being composite. Once N is odd, the factorisation doesn't drive the imbalance — the geographic distribution of places across the globe does, and that's beyond the control of N.

The practical rule: use an odd N ≥ 3. Primes are the safe, no-thought choice. Odd composites work equally well — if your infrastructure gives you 21 workers naturally (say, 3 nodes × 7 processes), don't feel you need to round to the nearest prime.

Composing across two dimensions: workers × time

Once you have an odd prime N for spatial sharding, you can add a second independent dimension — say, a daily batch — using another prime.

-- p = 7 workers, q = 29 day-cycle
-- Day d (0..28), Worker w (0..6)
WHERE cell::bigint % 29 = {day}
  AND cell::bigint % 7  = {worker}

Two separate modulo conditions. No join, no coordination. Each cell either satisfies both or it doesn't.

This works because 7 and 29 are coprime (gcd = 1). By the Chinese Remainder Theorem, each pair (cell % 29, cell % 7) corresponds to exactly one residue of cell % 203, so all 29 × 7 = 203 sub-shards are reachable and fill as evenly as a single modulo by 203 would. Each worker on each day gets ~161 cells (32,740 / 203) and ~358K places (72.8M / 203).

Why primes matter here more than just being odd. Two odd composites can share a factor. If you used N_days=15 (3×5) and N_workers=21 (3×7), gcd = 3. Then cell % 15 = 0 AND cell % 21 = 1 has no solution: % 15 = 0 forces cell % 3 = 0, but % 21 = 1 forces cell % 3 = 1. Only pairs whose two residues agree modulo 3 can occur, which is lcm(15, 21) = 105 of the 15 × 21 = 315 (day, worker) pairs. The other 210 never receive a cell. The affected pairs aren't random: they follow a predictable pattern, so entire worker-days silently sit empty while the rest carry three times the intended load.
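Both cases can be verified by brute force. This sketch counts which (day, worker) pairs are reachable for a given pair of moduli. It uses synthetic indexes with all-ones low 18 bits in place of real cells, and reachable_pairs is a name introduced here for illustration.

```python
from math import gcd

SENTINEL_BITS = 18
SENTINEL = (1 << SENTINEL_BITS) - 1  # low 18 bits all ones


def reachable_pairs(n_days, n_workers):
    """Return the set of (cell % n_days, cell % n_workers) pairs that occur."""
    total = n_days * n_workers
    # Enough consecutive high-bit values to hit every reachable residue.
    cells = ((q << SENTINEL_BITS) | SENTINEL for q in range(total))
    return {(c % n_days, c % n_workers) for c in cells}


for n_days, n_workers in [(29, 7), (15, 21)]:
    pairs = reachable_pairs(n_days, n_workers)
    print(n_days, n_workers, "gcd", gcd(n_days, n_workers),
          "reachable", len(pairs), "of", n_days * n_workers)
```

With coprime moduli every combination shows up; with a shared factor of 3 only a third of the grid is ever populated.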

Distinct primes are always coprime. Use primes for any dimension you want to compose independently, and pick different primes per dimension:

 dimension                      | prime       | combined shards
--------------------------------+-------------+-----------------
 workers only                   | 29          |              29
 workers × daily                | 29 × 7      |             203
 workers × daily × weekly sweep | 29 × 7 × 11 |           2,233

Each additional prime multiplies the granularity without breaking any existing dimension's distribution. A cell's position in a 3D prime grid is fully determined by (cell % 29, cell % 7, cell % 11), and each coordinate is independent.

Alternative: sharding on place ID instead of H3 cell

The same modulo pattern works on the primary key — place_id::bigint % N = worker_id — and sidesteps the odd-N constraint entirely. Sequential and random IDs don't share H3's sentinel bit pattern, so any N distributes cleanly.

The tradeoff is proximity. Sharding by H3 cell keeps geographically close places on the same worker. That matters for tasks that need to relate nearby places: duplicate detection, geocode QA, brand clustering, ML features using spatial neighbors. Splitting by ID scatters them randomly, so cross-boundary pair detection becomes a cross-worker coordination problem instead of a local gridDisk lookup.

If each place can be processed independently — normalization, enrichment, format conversion — ID-based sharding is simpler. If spatial locality matters, stay with H3 cells and the gridDisk overlap pattern from the previous post.

The worker pattern

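import os
import psycopg2

# Pool size and this worker's slot come from the environment.
WORKERS_NUM = int(os.environ["WORKERS_NUM"])  # odd, ≥ 3
WORKER_ID   = int(os.environ["WORKER_ID"])    # 0 .. WORKERS_NUM-1

# Fail fast on the configurations the benchmark showed to be broken.
if WORKERS_NUM < 3 or WORKERS_NUM % 2 == 0:
    raise ValueError("WORKERS_NUM must be odd and >= 3")
if not 0 <= WORKER_ID < WORKERS_NUM:
    raise ValueError("WORKER_ID must be in [0, WORKERS_NUM)")

conn = psycopg2.connect(...)
cur  = conn.cursor()

# In psycopg2, %% is a literal % (the modulo operator) when parameters are passed.
# Largest cells come first.
cur.execute("""
    SELECT cell, place_count
    FROM places_h3_t10000
    WHERE cell::bigint %% %(n)s = %(id)s
    ORDER BY place_count DESC
""", {"n": WORKERS_NUM, "id": WORKER_ID})

# process_cell is the application-specific job, defined elsewhere.
for cell, place_count in cur:
    process_cell(cell, place_count)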

No shared state. Restart a failed worker with the same env vars and it processes the identical set of cells.

With boundary overlap

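import h3

for cell, place_count in my_cells:
    # grid_disk(cell, 1) already returns the cell itself plus its ring-1 neighbors
    candidates = load_places_in(conn, h3.grid_disk(cell, 1))
    results    = find_matches(candidates)
    emit_home_only(results, cell)  # keep only results homed in this cell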

Conclusion

The real distinction isn't prime vs composite — it's odd vs even.

effective workers = odd part of N

When sizing your worker pool, pick any odd number ≥ 3. If you plan to add a time dimension later (daily batches, weekly sweeps), use primes so each axis stays coprime and composable.


Benchmark code: overturemaps-pg/pages/h3-parallel-workers. Full results: results_2026-05-18_220518.md. PostgreSQL 17, PostGIS 3.5, h3-pg 4.x, Overture Maps dataset (72.8M places).