The previous post built an adaptive frontier: 32,740 non-overlapping H3 cells, each holding at most 10,000 places, covering 72.8M places globally. Those cells are work units ready to queue.
Now the question: how do you split those cells across N parallel workers without overlap?
The simplest possible scheme:
worker_id = h3_cell_index % N
Worker i processes every cell where cell::bigint % N = i. No coordination, no distributed lock, no queue churn. Each worker is fully independent and restartable.
But does N matter? The initial instinct was to reach for primes — they're the canonical "safe" choice for hash distributions. The benchmark confirmed something, but not what I expected: the real distinction isn't prime vs composite at all.
Setup
We use places_h3_t10000, the adaptive frontier at threshold 10,000: 32,740 cells at resolutions 1–9, covering 72,783,221 places. Worker assignment in Postgres:
cell::bigint % {workers_num} AS worker_id
H3 cell indexes have bit 63 = 0 (reserved, always zero for valid cells), so the bigint is non-negative. Worker IDs are always in [0, N).
We tested 14 values of N: primes and composites, small and large, even and odd:
primes: 2, 3, 5, 7, 13, 29
composites: 4, 6, 8, 15, 16, 21, 25, 35
The odd composites (15=3×5, 21=3×7, 25=5², 35=5×7) are the key addition — they let us separate "composite" from "even" as the actual failure condition.
The validation check for each N: do all IDs from 0 to N-1 get at least one cell?
WITH per_worker AS (
SELECT cell::bigint % {workers_num} AS worker_id
FROM places_h3_t10000
GROUP BY worker_id
),
expected AS (
SELECT generate_series(0, {workers_num} - 1) AS worker_id
)
SELECT
count(e.worker_id) AS expected_workers,
count(p.worker_id) AS active_workers,
count(e.worker_id) - count(p.worker_id) AS idle_workers,
CASE WHEN count(p.worker_id) = count(e.worker_id) THEN 'PASS' ELSE 'FAIL' END AS check_result,
array_agg(e.worker_id ORDER BY e.worker_id) FILTER (WHERE p.worker_id IS NULL) AS idle_ids
FROM expected e
LEFT JOIN per_worker p ON e.worker_id = p.worker_id
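The same coverage check can be reproduced outside the database. The sketch below is a rough Python mirror of the query, assuming the cell indexes have already been fetched as plain integers. `coverage_check` and the synthetic `cells` list are illustrative names, not part of the benchmark code, and the synthetic values only imitate the low-bit pattern of the real cells.

```python
def coverage_check(cells, workers_num):
    """Mirror of the SQL check: which worker IDs in [0, workers_num) get at least one cell?"""
    active = {cell % workers_num for cell in cells}
    idle_ids = [w for w in range(workers_num) if w not in active]
    return {
        "expected_workers": workers_num,
        "active_workers": len(active),
        "idle_workers": len(idle_ids),
        "check_result": "PASS" if not idle_ids else "FAIL",
        "idle_ids": idle_ids,  # empty list here, where the SQL version yields NULL
    }

# Synthetic stand-ins for H3 cells (assumption): arbitrary high bits, low 18 bits all 1.
cells = [(q << 18) | 0x3FFFF for q in range(1000)]

report_7 = coverage_check(cells, 7)
report_8 = coverage_check(cells, 8)
print(report_7["check_result"], report_7["active_workers"])
print(report_8["check_result"], report_8["idle_ids"])
```

Running the check in application code before launching workers is cheap, and it catches a bad N before any worker sits idle.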
Results
| N | type | active | idle | check | cv_places | imbalance_ratio |
|---|---|---|---|---|---|---|
| 2 | prime | 1 | 1 | FAIL | — | — |
| 3 | prime | 3 | 0 | PASS | 0.0083 | 1.006 |
| 4 | composite | 1 | 3 | FAIL | — | — |
| 5 | prime | 5 | 0 | PASS | 0.0140 | 1.015 |
| 6 | composite | 3 | 3 | FAIL | — | — |
| 7 | prime | 7 | 0 | PASS | 0.0105 | 1.011 |
| 8 | composite | 1 | 7 | FAIL | — | — |
| 13 | prime | 13 | 0 | PASS | 0.0264 | 1.037 |
| 15 | composite | 15 | 0 | PASS | 0.0314 | 1.048 |
| 16 | composite | 1 | 15 | FAIL | — | — |
| 21 | composite | 21 | 0 | PASS | 0.0195 | 1.030 |
| 25 | composite | 25 | 0 | PASS | 0.0348 | 1.065 |
| 29 | prime | 29 | 0 | PASS | 0.0316 | 1.072 |
| 35 | composite | 35 | 0 | PASS | 0.0418 | 1.072 |
cv_places = stddev/mean. imbalance_ratio = max/mean (how much slower the busiest worker runs vs average). Balance metrics are only meaningful for passing configs, so failing rows leave them blank.
Two things jump out: N=2 is prime and fails. N=15, 21, 25, 35 are composites and pass with distribution indistinguishable from primes of the same size.

Per-worker detail: pass vs fail
N=7 (prime, PASS) — all 7 workers receive cells, tightly balanced:
| worker_id | cells | places |
|---|---|---|
| 0 | 4,636 | 10,516,175 |
| 1 | 4,695 | 10,326,536 |
| 2 | 4,673 | 10,235,796 |
| 3 | 4,696 | 10,503,856 |
| 4 | 4,633 | 10,418,204 |
| 5 | 4,723 | 10,477,655 |
| 6 | 4,684 | 10,304,999 |
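To make the two balance metrics concrete, the N=7 place counts above can be plugged into the definitions from the results table. This is a minimal sketch: `balance_metrics` is an illustrative helper, and the use of the sample standard deviation is an assumption (it is what reproduces the reported 0.0105 for this row).

```python
import statistics

# Per-worker place counts for N=7, copied from the table above.
places_n7 = [
    10_516_175, 10_326_536, 10_235_796, 10_503_856,
    10_418_204, 10_477_655, 10_304_999,
]

def balance_metrics(places_per_worker):
    """Return (cv, imbalance_ratio) for a list of per-worker totals."""
    mean = statistics.fmean(places_per_worker)
    cv = statistics.stdev(places_per_worker) / mean   # sample stddev (assumption)
    imbalance = max(places_per_worker) / mean          # busiest worker vs average
    return cv, imbalance

cv, imbalance = balance_metrics(places_n7)
print(f"cv_places={cv:.4f} imbalance_ratio={imbalance:.3f}")
# prints: cv_places=0.0105 imbalance_ratio=1.011
```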
N=8 (composite, FAIL) — workers 0–6 idle; worker 7 handles everything:
| worker_id | cells | places |
|---|---|---|
| 0 | 0 | 0 |
| … | 0 | 0 |
| 7 | 32,740 | 72,783,221 |
N=6 (composite, FAIL) — only odd IDs receive work:
| worker_id | cells | places |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 10,972 | 24,338,838 |
| 2 | 0 | 0 |
| 3 | 10,920 | 24,031,744 |
| 4 | 0 | 0 |
| 5 | 10,848 | 24,412,639 |
N=21 = 3×7 (composite, PASS) — all 21 workers active, even spread:
| worker_id | cells | places |
|---|---|---|
| 0 | 1,521 | 3,476,285 |
| 1 | 1,553 | 3,424,652 |
| 2 | 1,540 | 3,479,906 |
| … | ~1,560 | ~3,470,000 |
| 20 | 1,555 | 3,436,357 |
N=21 (3×7) distributes as cleanly as any prime of the same size. The composite structure doesn't hurt it at all.
Why even N fails
H3 cells are 64-bit integers. The format packs a resolution field and a chain of base-7 digits — each resolution level contributes one 3-bit digit (values 0–6). For a cell at resolution r, positions r+1..15 are filled with the invalid digit sentinel: 7 = 111 in binary.
Our adaptive view has cells at resolutions 1–9, so every cell has at least 6 unused digit positions (levels 10–15), each contributing 3 bits of all-ones — at least 18 bits locked to 1 (more for coarser cells; 18 is the minimum, shared by all). (The sentinel value itself is documented in the H3 spec; the consequence for modulo sharding is not.)
SELECT cell::bigint & ((1::bigint << 18) - 1) AS low18_bits, count(*)
FROM places_h3_t10000
GROUP BY 1;
low18_bits | count
------------+-------
262143 | 32740
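All 32,740 cells have identical low 18 bits: 262143 = 2^18 − 1. Every cell in this view is therefore an odd number (the same holds for any H3 cell below resolution 15, because its last digit slot holds the sentinel 7).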
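The bit layout can also be inspected directly in Python without any H3 library. The sketch below assumes the layout described in the H3 index documentation (resolution in bits 52 to 55, fifteen 3-bit digit slots in bits 0 to 44). `h3_fields` is an illustrative helper, and the sample index is an assumed resolution-5 cell, not one taken from places_h3_t10000.

```python
def h3_fields(cell: int) -> dict:
    """Unpack the fields of a 64-bit H3 cell index that matter for modulo sharding."""
    resolution = (cell >> 52) & 0xF
    # Digit i (1..15) occupies 3 bits; digit 15 sits in the lowest 3 bits.
    digits = [(cell >> (3 * (15 - i))) & 0b111 for i in range(1, 16)]
    return {
        "reserved_bit_63": cell >> 63,
        "resolution": resolution,
        "digits": digits,
    }

sample = 0x85283473FFFFFFF  # assumed sample index (resolution 5)
fields = h3_fields(sample)

# digits[r:] holds positions r+1..15, which should all be the sentinel 7.
unused = fields["digits"][fields["resolution"]:]

print(fields["resolution"], unused)
print(sample & ((1 << 18) - 1))  # low 18 bits
print(sample % 2)                # parity
```

For this sample the ten unused slots are all 7, so the low 30 bits are ones. That is more than the 18-bit minimum that resolution-9 cells guarantee.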
This is why even N fails, but the failure goes deeper than parity. For N = 2^k (a power of two, with k ≤ 18 here), cell % N reads only the low k bits, which are all fixed to 1. So every cell maps to exactly one worker ID: 262143 % N = N − 1. For N=2 → worker 1. For N=8 → worker 7. For N=16 → worker 15. Workers 0–6 are idle for N=8 not because they're even-numbered, but because every cell lands on residue 7.
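The general rule follows from gcd. Write each cell as 2^18 × q + (2^18 − 1): the second term is the fixed sentinel bits, the first is the variable part (bits 18 and above). For N = 2^k × m (m odd, k ≤ 18), gcd(2^18, N) = 2^k, so multiples of 2^18 can reach only N / 2^k = m distinct residues modulo N, and adding the fixed offset merely shifts them. Effective workers = odd part of N = m.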
| N | odd part | active |
|---|---|---|
| 4 = 4×1 | 1 | 1 |
| 6 = 2×3 | 3 | 3 |
| 8 = 8×1 | 1 | 1 |
| 16 = 16×1 | 1 | 1 |
| 15 = 1×15 | 15 | 15 |
| 21 = 1×21 | 21 | 21 |
| 35 = 1×35 | 35 | 35 |
Odd composites have odd part = themselves, so they fully activate all workers — same as primes.
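The odd-part rule is easy to check by simulation. The sketch below uses synthetic integers whose low 18 bits are all 1 as stand-ins for the real cells (an assumption; the benchmark itself ran against places_h3_t10000). `odd_part` and `active_workers` are illustrative helpers.

```python
def odd_part(n: int) -> int:
    """For n = 2^k * m with m odd, return m."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    while n % 2 == 0:
        n //= 2
    return n

def active_workers(cells, n: int) -> int:
    """Number of distinct worker IDs that receive at least one cell."""
    return len({cell % n for cell in cells})

# Synthetic cells (assumption): high bits vary freely, low 18 bits fixed to 1.
cells = [(q << 18) | ((1 << 18) - 1) for q in range(500)]

tested = [2, 3, 5, 7, 13, 29, 4, 6, 8, 15, 16, 21, 25, 35]
for n in tested:
    print(n, odd_part(n), active_workers(cells, n))

print(sorted({cell % 6 for cell in cells}))  # which IDs are active for N=6
```

On this synthetic input the predicted and observed counts agree for all 14 tested values, and N=6 activates only IDs 1, 3 and 5, matching the per-worker table.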
Balance: composites match primes
For passing configs at comparable sizes:
| N | type | imbalance_ratio | cv_places |
|---|---|---|---|
| 13 | prime | 1.037 | 0.0264 |
| 15 | composite (3×5) | 1.048 | 0.0314 |
| 21 | composite (3×7) | 1.030 | 0.0195 |
| 25 | composite (5²) | 1.065 | 0.0348 |
| 29 | prime | 1.072 | 0.0316 |
| 35 | composite (5×7) | 1.072 | 0.0418 |
N=21 is actually more balanced than N=29, despite being composite. Once N is odd, the factorisation doesn't drive the imbalance — the geographic distribution of places across the globe does, and that's beyond the control of N.
The practical rule: use an odd N ≥ 3. Primes are the safe, no-thought choice. Odd composites work equally well — if your infrastructure gives you 21 workers naturally (say, 3 nodes × 7 processes), don't feel you need to round to the nearest prime.
Composing across two dimensions: workers × time
Once you have an odd prime N for spatial sharding, you can add a second independent dimension (say, a daily batch) using a different prime.
-- p = 7 workers, q = 29 day-cycle
-- Day d (0..28), Worker w (0..6)
WHERE cell::bigint % 29 = {day}
AND cell::bigint % 7 = {worker}
Two separate modulo conditions. No join, no coordination. Each cell either satisfies both or it doesn't.
This works because 7 and 29 are coprime (gcd = 1). By the Chinese Remainder Theorem, each pair (cell % 29, cell % 7) corresponds to exactly one residue of cell % 203, so all 29 × 7 = 203 sub-shards are reachable and fill as evenly as a single modulo-203 split would. Each worker on each day gets ~161 cells (32,740 / 203) and ~359K places (72.8M / 203).
Why primes matter here more than just being odd. Two odd composites can share a factor. If you used N_days=15 (3×5) and N_workers=21 (3×7), gcd = 3. Then cell % 15 = 0 AND cell % 21 = 1 has no solution: % 15 = 0 forces cell % 3 = 0, but % 21 = 1 forces cell % 3 = 1. Only pairs whose two residues agree modulo 3 can occur, so 210 of the 315 (day, worker) pairs get zero cells and the remaining 105 carry about three times the intended load. No cell is lost, but the empty pairs aren't random: they follow a fixed pattern, so the same workers sit idle on the same days while the schedule looks fully populated.
Distinct primes are always coprime. Use primes for any dimension you want to compose independently, and pick different primes per dimension:
| dimension | primes | combined shards |
|---|---|---|
| workers only | 7 | 7 |
| workers × daily | 7 × 29 | 203 |
| workers × daily × weekly sweep | 7 × 29 × 11 | 2,233 |
Each additional prime multiplies the granularity without breaking any existing dimension's distribution. A cell's position in a 3D prime grid is fully determined by (cell % 29, cell % 7, cell % 11), and each coordinate is independent.
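The coprimality argument in this section can be checked by brute force. The sketch below enumerates which residue combinations are reachable at all, using consecutive integers over one full cycle as stand-ins for cell indexes (an assumption; real cells are a sparse subset, so this shows what is possible, not how evenly the real data fills each slot). `reachable` and `coverage` are illustrative helpers.

```python
from math import prod

def reachable(moduli, values):
    """Set of residue tuples (v % m1, v % m2, ...) actually produced by values."""
    return {tuple(v % m for m in moduli) for v in values}

def coverage(moduli):
    """Return (reachable combinations, total combinations) over one full cycle."""
    total = prod(moduli)
    return len(reachable(moduli, range(total))), total

print(coverage((29, 7)))      # (203, 203): distinct primes, every pair reachable
print(coverage((15, 21)))     # (105, 315): shared factor 3, two thirds of pairs empty
print(coverage((29, 7, 11)))  # (2233, 2233): a third prime keeps every slot reachable

# The specific pair from the text: % 15 = 0 together with % 21 = 1.
print((0, 1) in reachable((15, 21), range(315)))  # False
```

The reachable count equals the least common multiple of the moduli, which matches the product only when they are pairwise coprime.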
Alternative: sharding on place ID instead of H3 cell
The same modulo pattern works on the primary key (place_id::bigint % N = worker_id) and sidesteps the odd-N constraint entirely. Sequential and random IDs don't share H3's sentinel bit pattern, so any N distributes cleanly, as long as the IDs themselves carry no fixed low bits.
The tradeoff is proximity. Sharding by H3 cell keeps geographically close places on the same worker. That matters for tasks that need to relate nearby places: duplicate detection, geocode QA, brand clustering, ML features using spatial neighbors. Splitting by ID scatters them randomly, so cross-boundary pair detection becomes a cross-worker coordination problem instead of a local gridDisk lookup.
If each place can be processed independently — normalization, enrichment, format conversion — ID-based sharding is simpler. If spatial locality matters, stay with H3 cells and the gridDisk overlap pattern from the previous post.
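A quick contrast shows why the odd-N constraint is specific to H3 cell indexes. The sketch below assumes integer primary keys, as the `place_id::bigint` cast implies. Both the sequential ID range and the H3-like integers are synthetic, and `shard_counts` is an illustrative helper.

```python
from collections import Counter

def shard_counts(ids, n):
    """Items per worker ID in [0, n) under plain modulo assignment."""
    counts = Counter(i % n for i in ids)
    return [counts.get(w, 0) for w in range(n)]

sequential_ids = range(1, 80_001)                           # hypothetical sequential keys
h3_like = [(q << 18) | 0x3FFFF for q in range(80_000)]      # low 18 bits fixed to 1

print(shard_counts(sequential_ids, 8))  # every worker gets 10000
print(shard_counts(h3_like, 8))         # everything lands on worker 7
```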
The worker pattern
import os
import psycopg2

WORKERS_NUM = int(os.environ["WORKERS_NUM"])  # odd, ≥ 3
WORKER_ID = int(os.environ["WORKER_ID"])      # 0 .. WORKERS_NUM-1

conn = psycopg2.connect(...)
cur = conn.cursor()
# %% is a literal % here, because the query uses psycopg2 parameter placeholders
cur.execute("""
    SELECT cell, place_count
    FROM places_h3_t10000
    WHERE cell::bigint %% %(n)s = %(id)s
    ORDER BY place_count DESC
""", {"n": WORKERS_NUM, "id": WORKER_ID})

# Largest cells first; process_cell is the task-specific handler
for cell, place_count in cur:
    process_cell(cell, place_count)
No shared state. Restart a failed worker with the same env vars and it processes the identical set of cells.
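Because an even WORKERS_NUM fails silently (idle workers rather than an error), it may be worth rejecting bad configurations at startup. The guard below is a minimal sketch. `validate_worker_config` is a hypothetical helper, not part of the benchmark code, and it encodes the post's rule of thumb (odd N ≥ 3) for H3-cell sharding.

```python
def validate_worker_config(workers_num: int, worker_id: int) -> None:
    """Raise ValueError if the sharding config would leave workers idle or out of range."""
    if workers_num < 3 or workers_num % 2 == 0:
        raise ValueError(
            f"WORKERS_NUM={workers_num} must be odd and >= 3 for H3-cell sharding"
        )
    if not 0 <= worker_id < workers_num:
        raise ValueError(
            f"WORKER_ID={worker_id} must be in [0, {workers_num})"
        )

# Passes silently for a valid config; an even pool size raises before any query runs.
validate_worker_config(7, 3)
try:
    validate_worker_config(8, 3)
except ValueError as exc:
    print(exc)
```

Calling it right after reading the environment variables keeps the failure at deploy time instead of surfacing later as unexplained idle workers.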
With boundary overlap
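for cell, place_count in my_cells:
    # grid_disk(cell, 1) already returns the cell itself plus its ring-1 neighbours
    candidates = load_places_in(conn, h3.grid_disk(cell, 1))
    results = find_matches(candidates)
    # emit only the results that belong to this worker's own (home) cell
    emit_home_only(results, cell)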
Conclusion
The real distinction isn't prime vs composite — it's odd vs even.
effective workers = odd part of N
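- Odd N (prime or composite): all IDs 0..N-1 receive cells, and the busiest worker stays under 8% above the mean (imbalance_ratio ≤ 1.072) for every tested N ≤ 35
- Even N: the fixed low bits pin every cell to a single residue modulo the power-of-two factor of N, so you get only odd_part(N) active workers, and which IDs are active depends on N, not simply on parity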
- N=2: fails even though it's prime — the only even prime is not exempt
- Odd composites (15, 21, 25, 35): pass and balance as well as primes of the same size
- Composing dimensions (workers × time): use distinct primes per axis — coprime modulos are independent by CRT, sub-shards stay uniform. Composites sharing a factor silently empty some (day, worker) pairs
When sizing your worker pool, pick any odd number ≥ 3. If you plan to add a time dimension later (daily batches, weekly sweeps), use primes so each axis stays coprime and composable.
Benchmark code: overturemaps-pg/pages/h3-parallel-workers. Full results: results_2026-05-18_220518.md. PostgreSQL 17, PostGIS 3.5, h3-pg 4.x, Overture Maps dataset (72.8M places).