Beam Pixel Clustering & Calibration

This page explains how the beam pixel clustering calibration works, defines the B_ℓ divergence quality metric, and gives guidance on choosing clustering_error_threshold.

Overview

Spatial k-means clustering on the unit sphere reduces the number of effective beam pixels before TOD generation. Only the low-power tail of the beam is clustered (controlled by beam_cluster_tail_fraction); the bright main-lobe pixels are kept pixel-exact. Because the cost of the innermost Numba gather loops scales with the number of beam pixels, the speed-up is proportional to the pixel reduction, at the price of a small, controllable accuracy loss.
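The idea can be illustrated with a minimal sketch. It assumes that beam pixels are given as unit vectors with non-negative power weights, and that the tail is the set of faintest pixels that together hold tail_fraction of the total power. The function name cluster_beam_tail, its arguments, and the plain Lloyd iteration are hypothetical and chosen for illustration; the pipeline's actual implementation may differ.

```python
import numpy as np


def cluster_beam_tail(vecs, power, tail_fraction, n_clusters, n_iter=20, seed=0):
    """Merge the low-power tail of a beam into at most n_clusters centroids.

    vecs  : (S, 3) unit vectors of the beam pixels
    power : (S,) non-negative beam weights
    Returns (out_vecs, out_power); main-lobe pixels pass through unchanged.
    """
    order = np.argsort(power)                     # faintest pixels first
    cum = np.cumsum(power[order]) / power.sum()   # cumulative power fraction
    tail = order[cum <= tail_fraction]            # faintest pixels holding tail_fraction of power
    main = order[cum > tail_fraction]             # bright pixels, kept pixel-exact
    if tail.size <= n_clusters:                   # nothing to gain from clustering
        return vecs.copy(), power.copy()

    rng = np.random.default_rng(seed)
    tv, tp = vecs[tail], power[tail]
    cent = tv[rng.choice(tail.size, n_clusters, replace=False)]
    for _ in range(n_iter):
        labels = np.argmax(tv @ cent.T, axis=1)   # nearest centroid = largest dot product
        sums = np.zeros_like(cent)
        np.add.at(sums, labels, tv * tp[:, None])  # power-weighted sum of directions
        norms = np.linalg.norm(sums, axis=1)
        ok = norms > 0                            # leave empty clusters where they are
        cent[ok] = sums[ok] / norms[ok, None]     # project back onto the unit sphere

    labels = np.argmax(tv @ cent.T, axis=1)
    cpow = np.bincount(labels, weights=tp, minlength=n_clusters)
    keep = cpow > 0                               # drop clusters that received no power
    out_vecs = np.vstack([vecs[main], cent[keep]])
    out_power = np.concatenate([power[main], cpow[keep]])
    return out_vecs, out_power
```

In this sketch the total beam power is conserved, and clusters that end up empty are dropped, so the output pixel count can be smaller than the number of main-lobe pixels plus n_clusters.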

Note

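Clustering is applied only to the TOD-generation path. Any B_ℓ used downstream must always be computed from the full, unclustered beam pixel set: pixel merging destroys the Legendre polynomial oscillations on scales ~π/ℓ, so production B_ℓ computations must bypass this step. The calibration described below does evaluate B_ℓ on clustered centroids, but only as a diagnostic of how much a given clustering distorts the beam.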

How the Calibration Works

When clustering_calibration_enabled: true is set, the pipeline sweeps a (tail_fraction × n_clusters) grid. For each candidate pair it:

Clusters a copy of the beam pixels using the candidate parameters.
Computes the beam transfer function B_ℓ from the clustered centroids (power_cut = 1.0).
Computes the reference B_ℓ from the full unclustered beam (computed once, reused for all grid points).
Measures the relative RMS B_ℓ divergence (see below).
Records the pixel-count speedup as S / K_out, where S is the original pixel count and K_out is the number of pixels after clustering.

The pair that achieves the highest speedup while keeping B_ℓ divergence below clustering_error_threshold is written to the config. If no pair qualifies, the pair with the lowest divergence is used with a warning.
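The selection rule above can be sketched as follows. GridPoint and pick_best are hypothetical names introduced for illustration, and the pass criterion uses ≤, matching the status column of the calibration table; the sample rows are taken from the example table on this page.

```python
import warnings
from typing import NamedTuple


class GridPoint(NamedTuple):
    tail_fraction: float
    n_clusters: int
    speedup: float      # S / K_out
    divergence: float   # relative RMS B_ell divergence


def pick_best(points, error_threshold):
    """Return the fastest grid point that passes, else the most accurate one."""
    passing = [p for p in points if p.divergence <= error_threshold]
    if passing:
        return max(passing, key=lambda p: p.speedup)
    warnings.warn("no grid point meets clustering_error_threshold; "
                  "falling back to the lowest-divergence pair")
    return min(points, key=lambda p: p.divergence)


grid = [
    GridPoint(0.005, 10, 1.00, 1.23e-07),
    GridPoint(0.05, 500, 2.63, 8.41e-06),
    GridPoint(0.05, 1000, 3.11, 3.92e-06),
]
best = pick_best(grid, error_threshold=1.0e-5)
print(best.tail_fraction, best.n_clusters)
```

With a threshold that no row satisfies, the same call falls back to the row with the smallest divergence and emits a warning.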

No scan data is read and no TOD is generated during calibration. The metric depends only on beam geometry, which keeps the sweep fast.

B_ℓ Divergence Metric

The quality of a (tail_fraction, n_clusters) pair is quantified by:

\[\varepsilon_{B_\ell} = \frac{\mathrm{RMS}_\ell\!\left(B_\ell^{\mathrm{clust}} - B_\ell^{\mathrm{ref}}\right)} {\mathrm{RMS}_\ell\!\left(B_\ell^{\mathrm{ref}}\right)}\]

where:

\(B_\ell^{\mathrm{ref}}\) is the beam transfer function computed from the full unclustered pixel set with power_cut = 1.0.
\(B_\ell^{\mathrm{clust}}\) is the beam transfer function computed from the centroid pixels produced by the candidate pair.
The RMS is taken over multipoles \(\ell = 0 \ldots \ell_{\max}\), where \(\ell_{\max}\) defaults to 2 × nside of the sky map (or 500 if no map is available).
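As a rough illustration of the metric, the sketch below evaluates B_ℓ as a power-weighted sum of Legendre polynomials over the pixel set and then forms the relative RMS difference. The function names beam_bl and bl_divergence, the normalisation B_0 = 1, and the direct Legendre sum are assumptions made for this example; the pipeline may compute B_ℓ differently.

```python
import numpy as np
from numpy.polynomial.legendre import legvander


def beam_bl(cos_theta, power, lmax):
    """B_ell = sum_i w_i P_ell(cos theta_i), normalised so that B_0 = 1.

    cos_theta : (n_pix,) cosine of each pixel's angle from the beam axis
    power     : (n_pix,) beam weight of each pixel
    """
    legendre = legvander(cos_theta, lmax)     # shape (n_pix, lmax + 1)
    return (power / power.sum()) @ legendre


def bl_divergence(bl_clust, bl_ref):
    """Relative RMS difference over ell = 0 .. lmax."""
    num = np.sqrt(np.mean((bl_clust - bl_ref) ** 2))
    den = np.sqrt(np.mean(bl_ref ** 2))
    return num / den


# Toy example: a Gaussian-like beam sampled on a ring of polar angles.
theta = np.linspace(0.0, 0.1, 200)
weights = np.exp(-0.5 * (theta / 0.03) ** 2)
bl_ref = beam_bl(np.cos(theta), weights, lmax=50)
eps = bl_divergence(1.01 * bl_ref, bl_ref)    # a uniform 1 % error gives eps = 0.01
```

The Legendre matrix holds n_pix × (ℓ_max + 1) values, so a real beam would likely be processed in chunks rather than in one call.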

Why B_ℓ divergence rather than TOD error?

No scan data needed. The metric is computed from beam geometry alone, so the calibration sweep is fast and can be run independently of the observation schedule.
Direct beam fidelity. B_ℓ controls how the beam couples to each angular scale of the sky. A clustering that faithfully reproduces B_ℓ will also reproduce the TOD accurately, because TOD errors ultimately arise from beam-shape distortions that are captured in B_ℓ.

Calibration Output Table

When the calibration runs it prints an ASCII table of the form:

[clust_calib] error_threshold=1.0e-05
 tail%      K   K_out   speedup   B_ell div  status
--------------------------------------------------------
  0.5%     10      10      1.00   1.23e-07  ✓
  0.5%     20      20      1.00   1.23e-07  ✓
  ...
  5.0%    500     487      2.63   8.41e-06  ✓
  5.0%   1000     912      3.11   3.92e-06  ✓
--------------------------------------------------------

[clust_calib] Recommendation: tail_fraction=0.05, n_clusters=1000
  (speedup=3.11x, B_ell div=3.92e-06)

Columns:

tail% — fraction of total beam power in the clustered tail.
K — requested number of tail clusters.
K_out — actual number of output pixels (n_main + K_tail).
speedup — ratio S / K_out where S is the original pixel count.
B_ell div — \(\varepsilon_{B_\ell}\) for this pair.
status — ✓ if B_ell div ≤ clustering_error_threshold, ✗ otherwise.

Choosing `clustering_error_threshold`

The threshold controls the strictness of the B_ℓ fidelity requirement. Lower values preserve more of the beam shape but allow less aggressive clustering (smaller speedup).

Precision tier	`clustering_error_threshold`	Notes
Conservative (default)	`1.0e-5`	Safe for science-grade pipelines. Typical speedup 2–4× for a 5 % tail with 500–1000 clusters.
Moderate	`1.0e-4`	Suitable for survey-speed optimisation where a small B_ℓ bias is acceptable. Allows more aggressive tail clustering.
Loose / exploratory	`1.0e-3`	Useful for rapid prototyping. The B_ℓ shape may be visibly distorted at high ℓ.

Practical notes:

Start with the default (1.0e-5) and inspect the calibration table. If all grid points pass, there may be room to relax the threshold and gain more speedup; how far to go is a trade-off between the noise level of the data and how accurately B_ℓ needs to be characterized. If none pass, reduce tail_fraction or increase n_clusters.
Interpolation errors (see Beam Interpolation Accuracy) set a separate noise floor on sky-map lookups and are independent of this threshold. There is no strict relationship between the two metrics, so their tolerances should be set independently.
B_ℓ divergence is independent of scan strategy — it depends only on beam geometry and the clustering parameters.