Getting Started =============== Installation ------------ Install the required Python packages:: pip install numpy healpy pixell numba pyyaml psutil ``psutil`` is optional but recommended — it enables automatic CPU and memory detection on both local machines and HPC clusters. Quick Start ----------- 1. **Copy and edit the config**:: cp config.yaml config_local.yaml $EDITOR config_local.yaml At minimum set: * ``FOLDER_SCAN`` — directory with ``theta_N.npy`` / ``phi_N.npy`` / ``psi_N.npy`` scan files. * ``FOLDER_TOD_OUTPUT`` — where output ``tod_day_N.npy`` files are written. * ``path_to_map`` — HEALPix FITS file containing I, Q, U. * ``FOLDER_BEAM`` and ``beam_file_I/Q/U`` — beam FITS files. 2. **(Optional) Pre-compute the beam rotation cache**:: python precompute_beam_cache.py --n_psi 720 This eliminates one or both Rodrigues rotations per sample at runtime, yielding roughly a 25 % speed-up. However, the psi-roll is evaluated on a discrete grid rather than continuously, which introduces a small interpolation error. **Not recommended for experiments requiring high precision.** If you do use caching, set ``beam_cache_dir`` in your config to the output directory. 3. **Run the pipeline**:: python sample_based_tod_generation_gridint.py On first run the pipeline measures throughput at several batch sizes and process counts, writes the optimal values to the config, and processes all days. Subsequent runs skip calibration automatically. Running on HPC / SLURM ----------------------- The pipeline is SLURM-aware. Set ``--cpus-per-task`` and ``--mem`` in your job script; calibration will find the best ``n_processes`` and ``batch_size`` for the allocated resources:: #!/bin/bash #SBATCH --ntasks=1 #SBATCH --cpus-per-task=32 #SBATCH --mem=128G python sample_based_tod_generation_gridint.py On memory-constrained nodes the optimal process count is often *fewer* than the total allocated CPUs — the calibration captures this correctly. Output Files ------------ One ``.npy`` file is written per processing batch (the filename uses a *day* index by convention, but the index can represent any batching unit you choose):: FOLDER_TOD_OUTPUT/tod_day_0.npy FOLDER_TOD_OUTPUT/tod_day_1.npy ... Each file has shape ``(3, n_samples)`` and dtype ``float32``. Axis 0 is the Stokes component: ``[I, Q, U]``.