tod_calibrate

Empirical batch-size and process-count calibration.

The calibration runs once before the day loop, measures sustained throughput over a range of candidate configurations using interleaved timing repeats (to avoid L3 cache warm-up bias), and writes the optimal values to the config file for subsequent runs. All calibration functions are internal helpers (prefixed with _) and are not part of the public API.

Runtime calibration for tod_exact_gen_batched.

Calibrates three knobs jointly:
  • n_processes (P) — worker processes (parallel over days)

  • numba_threads (T) — threads per worker (parallel over batch via prange)

  • batch_size (B) — samples per fused-kernel invocation

The fused kernel (a1d5d36) parallelises the entire Rodrigues+gather over prange(B). Spawning P workers each using T = NUMBA_NUM_THREADS_DEFAULT (=all cores) oversubscribes by P×; the new search enforces P*T ≤ N_cores.

Strategy (~30s wall time):

Phase A — single-process throughput vs T at a fixed B. Phase B — for best T, sweep B around 16×T to land on the throughput plateau. Phase C — enumerate (P, T) with P*T ≤ N_cores; pick max P × tp(T).

Memory budget per process must accommodate mp_stacked (already in shared memory, but counted defensively) plus transient per-batch buffers.

Beam clustering calibration is unchanged (driven by science accuracy, not speed).

tod_calibrate.calibrate_runtime(beam_data, folder_scan, probe_day, mp, n_cpu_ceiling, max_processes_user, interp_mode='bilinear', prefix='', center_idx=None, z_skip_threshold=-1.0)[source]

Joint (n_processes, numba_threads, batch_size) calibration.

Parameters:
  • beam_data – from prepare_beam_data (after clustering, with mp_stacked).

  • folder_scan – scan directory.

  • probe_day – any valid day index for probe data.

  • mp – list of sky-map components.

  • n_cpu_ceiling – hard ceiling from scheduler/affinity (_get_ncpus()).

  • max_processes_user – user-configured n_processes (laptop cap, etc). Acts as an upper bound on P.

  • interp_mode – ‘nearest’ or ‘bilinear’.

  • prefix – log prefix.

Returns:

(n_processes, n_threads, batch_size)

tod_calibrate.calibrate_beam_clustering(beam_data, folder_scan=None, probe_day=None, mp=None, error_threshold=0.001, bell_lmax=None, interp_mode='bilinear')[source]

Find (tail_fraction, n_clusters) maximising speedup s.t. B_ell error ≤ error_threshold.

Computes reference B_ell (power_cut=1.0) from unclustered beam, then sweeps a fixed (tail_fraction × n_clusters) grid. The pair maximising speedup with relative-RMS B_ell divergence ≤ error_threshold wins; if no pair qualifies, the minimum-divergence pair is returned with a warning.