Pipeline

mejiro_pipeline

Step 0: Cache PSFs

Generate and cache Roman PSFs in parallel.

This script reads a YAML configuration file specifying parameters for PSF generation, such as oversampling factors, bands, detectors, detector positions, and pixel sizes. It determines which PSFs need to be generated based on existing cache files, and spins up multiple processes to compute and save the missing PSFs as .npy files.
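The cache check described above can be sketched as follows. This is a minimal illustration, not mejiro's actual implementation: the file-naming scheme, the helper names `psf_cache_name` and `find_missing_psfs`, and the parameter values are all assumptions made for the example.

```python
import itertools
import os
import tempfile


def psf_cache_name(band, detector, position, oversample, num_pix):
    """Hypothetical naming scheme for one cached PSF file."""
    x, y = position
    return f"{band}_{detector}_{x}_{y}_{oversample}_{num_pix}.npy"


def find_missing_psfs(cache_dir, bands, detectors, positions, oversamples, num_pixes):
    """Return the parameter tuples whose .npy file is not yet in cache_dir."""
    missing = []
    combos = itertools.product(bands, detectors, positions, oversamples, num_pixes)
    for params in combos:
        path = os.path.join(cache_dir, psf_cache_name(*params))
        if not os.path.exists(path):
            missing.append(params)
    return missing


with tempfile.TemporaryDirectory() as cache_dir:
    # Pretend the F129 PSF was cached by an earlier run.
    cached = psf_cache_name("F129", 1, (2048, 2048), 5, 101)
    open(os.path.join(cache_dir, cached), "wb").close()

    missing = find_missing_psfs(
        cache_dir, ["F129", "F184"], [1], [(2048, 2048)], [5], [101]
    )

# Only the missing combinations would be handed to a process pool
# (e.g. multiprocessing.Pool.map) for generation.
print(missing)
```

Checking the cache before starting any workers keeps reruns cheap: only the combinations without an existing file are recomputed.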

Usage:

python3 _00_cache_psfs.py --config <config.yaml> [--data_dir <output_dir>]

Arguments:

--config: Path to the YAML configuration file.
--data_dir: Optional override for the data directory specified in the config file.
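Most scripts in this pipeline share the same two arguments. A minimal sketch of how such a command line could be parsed and the override applied is shown below; the `data_dir` config key and the `resolve_data_dir` helper are assumptions for illustration, and a plain dict stands in for the parsed YAML file.

```python
import argparse


def parse_args(argv=None):
    """Parse the two arguments documented above."""
    parser = argparse.ArgumentParser(description="Sketch of the shared pipeline CLI.")
    parser.add_argument("--config", required=True,
                        help="Path to the YAML configuration file.")
    parser.add_argument("--data_dir", default=None,
                        help="Optional override for the data directory.")
    return parser.parse_args(argv)


def resolve_data_dir(config, args):
    """Prefer the command-line override; otherwise use the config value.

    The "data_dir" key is a hypothetical name for the config entry.
    """
    return args.data_dir if args.data_dir is not None else config["data_dir"]


config = {"data_dir": "/path/from/config"}  # stands in for the parsed YAML

args_default = parse_args(["--config", "config.yaml"])
args_override = parse_args(["--config", "config.yaml", "--data_dir", "/scratch/run1"])

default_dir = resolve_data_dir(config, args_default)
override_dir = resolve_data_dir(config, args_override)
print(default_dir, override_dir)
```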

Step 1: Run Survey Simulation

Simulates a strong lensing survey and identifies detectable lensing systems using configurable pipelines.

This script simulates a survey and identifies detectable strong lensing systems using the SLSim pipeline with additional enhancements. It reads YAML configuration files for SkyPy, SLHammocks (optional), and mejiro, which specify survey parameters and simulation options. The script supports multiprocessing to parallelize survey runs across available CPU cores.

Usage:

python3 _01_run_survey_simulation.py --config <config.yaml> [--data_dir <output_dir>]

Arguments:

--config: Path to the YAML configuration file.
--data_dir: Optional override for the data directory specified in the config file.

Step 1a: Generate Galaxy Tables

Pre-generates galaxy population tables for the survey simulation.

This script generates a configurable number of galaxy tables by running the SkyPy and SLHammocks pipelines, then serializes them as pickle files. This separates the expensive galaxy population generation from the survey simulation itself, so the simulation step (_01b) can load pre-computed tables and skip that initialization.

Each table is generated with a unique random seed for reproducibility. Tables are associated with detectors (for Roman, each table is assigned a detector via round-robin), since SkyPy configs differ per detector.
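The round-robin detector assignment and per-table seeding can be sketched as below. The function name `plan_galaxy_tables`, the dictionary layout, and the `base_seed + index` seeding rule are assumptions for illustration; the actual script may derive seeds differently.

```python
def plan_galaxy_tables(num_tables, detectors, base_seed=0):
    """Assign each table a detector (round-robin) and a unique seed.

    The seed rule (base_seed + table index) is an assumed scheme that
    guarantees distinct, reproducible seeds.
    """
    plan = []
    for index in range(num_tables):
        plan.append({
            "table_index": index,
            "detector": detectors[index % len(detectors)],
            "seed": base_seed + index,
        })
    return plan


# Five tables spread over three detectors: the detector list wraps around.
plan = plan_galaxy_tables(5, detectors=[1, 2, 3], base_seed=42)
for entry in plan:
    print(entry)
```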

Usage:

python3 _01a_generate_galaxy_tables.py --config <config.yaml> [--data_dir <output_dir>]

Arguments:

--config: Path to the YAML configuration file.
--data_dir: Optional override for the data directory specified in the config file.

Step 1b: Run Survey Simulation (Pre-computed Tables)

Simulates a strong lensing survey using pre-computed galaxy tables from _01a.

This script loads galaxy population tables generated by _01a_generate_galaxy_tables.py, reconstructs the lens population objects, and runs the survey simulation. This avoids the expensive SkyPy/SLHammocks initialization that dominates runtime in the original _01.

Each run is assigned a galaxy table via round-robin (run_index % num_galaxy_tables). Per-run random seeding ensures reproducible, unique draws across runs.
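A minimal sketch of the table assignment and per-run seeding follows. The modulo rule is taken from the description above; the helper names and the `base_seed + run_index` seeding rule are assumptions, and Python's standard `random.Random` stands in for whatever random number generator the script actually uses.

```python
import random


def table_for_run(run_index, num_galaxy_tables):
    """Round-robin table assignment: run_index % num_galaxy_tables."""
    return run_index % num_galaxy_tables


def rng_for_run(base_seed, run_index):
    """Independent generator per run (assumed rule: base_seed + run_index)."""
    return random.Random(base_seed + run_index)


# Five runs sharing two pre-computed tables.
assignments = [table_for_run(i, 2) for i in range(5)]
print(assignments)

# The same run index always reproduces the same draw ...
draw_a = rng_for_run(7, 3).random()
draw_b = rng_for_run(7, 3).random()
# ... while a different run index gets a different stream.
draw_c = rng_for_run(7, 4).random()
```

Runs 0, 2, and 4 load the same galaxy table here, so the per-run seed is what keeps their draws distinct.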

Usage:

python3 _01b_run_survey_simulation.py --config <config.yaml> [--data_dir <output_dir>]

Arguments:

--config: Path to the YAML configuration file.
--data_dir: Optional override for the data directory specified in the config file.

Step 2: Build Lens List

Builds a list of StrongLens objects from previously detected lensing systems.

This script processes the output of a strong lensing survey simulation, converting detected lensing systems into mejiro StrongLens objects. It reads configuration parameters from a mejiro configuration YAML file and supports multiple instruments (Roman, HWO). Output lenses are pickled for downstream analysis.

Usage:

python3 _02_build_lens_list.py --config <config.yaml> [--data_dir <output_dir>]

Arguments:

--config: Path to the YAML configuration file.
--data_dir: Optional override for the data directory specified in the config file.

Step 3: Generate Subhalo Realizations

Step 4: Create Synthetic Images (Ray-Shooting)

Generates synthetic images: idealized images with no noise or detector effects, optionally convolved with the PSF.

This script creates synthetic images for each lensing system identified in previous pipeline steps, using instrument-specific parameters and PSF models. It reads a YAML configuration file specifying survey, instrument, and image simulation options. Multiprocessing is used to parallelize image generation across available CPU cores.

Usage:

python3 _04_create_synthetic_images.py --config <config.yaml> [--data_dir <output_dir>]

Arguments:

--config: Path to the YAML configuration file.
--data_dir: Optional override for the data directory specified in the config file.

Step 5: Create Exposures

Generates exposures from synthetic images, i.e., applies sky background and detector effects to idealized images.

This script processes synthetic images produced in previous pipeline steps, generating exposures for each lensing system using instrument-specific parameters and simulation engines. It reads a mejiro YAML configuration file specifying exposure options. Multiprocessing is used to parallelize exposure creation across available CPU cores.

Usage:

python3 _05_create_exposures.py --config <config.yaml> [--data_dir <output_dir>]

Arguments:

--config: Path to the YAML configuration file.
--data_dir: Optional override for the data directory specified in the config file.

Step 5 (Alternative): Create Exposures (romanisim)

Runs romanisim detector simulation on tiled synthetic images.

This script tiles SyntheticImage pickles into 4088x4088 detector arrays, runs romanisim to apply detector effects, and extracts individual cutouts. Systems are processed in batches of 3136 (56x56 grid of 73x73 tiles) until all systems for each SCA/band are complete. Multiprocessing is used to parallelize batch processing.
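The tiling arithmetic works out exactly: 56 tiles of 73 pixels span 4088 pixels per side, giving 56 x 56 = 3136 tiles per detector array. The sketch below maps a tile slot to its pixel slices and counts batches; the row-major slot ordering and the helper names are assumptions for illustration, not the script's confirmed layout.

```python
import math

DETECTOR_SIZE = 4088                          # pixels per side of the detector array
TILE_SIZE = 73                                # pixels per side of one cutout
TILES_PER_SIDE = DETECTOR_SIZE // TILE_SIZE   # 56
BATCH_SIZE = TILES_PER_SIDE ** 2              # 3136 systems per batch


def tile_slices(slot):
    """Row and column slices of tile `slot` (0-based, assumed row-major order)."""
    if not 0 <= slot < BATCH_SIZE:
        raise IndexError(f"slot {slot} is outside 0..{BATCH_SIZE - 1}")
    row, col = divmod(slot, TILES_PER_SIDE)
    y0, x0 = row * TILE_SIZE, col * TILE_SIZE
    return slice(y0, y0 + TILE_SIZE), slice(x0, x0 + TILE_SIZE)


def num_batches(num_systems):
    """Number of batches needed to process all systems for one SCA/band."""
    return math.ceil(num_systems / BATCH_SIZE)


first = tile_slices(0)
last = tile_slices(BATCH_SIZE - 1)
print(TILES_PER_SIDE, BATCH_SIZE, first, last, num_batches(10000))
```

With a 2D array `detector`, `detector[tile_slices(slot)]` would select the same 73x73 region both when placing a synthetic image and when extracting its cutout after the detector simulation.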

Usage:

python3 romanisim_pipeline.py --config <config.yaml> [--resume]

Arguments:

--config: Path to the YAML configuration file.
--resume: Preserve existing output and skip already-completed batches (those with a batch_complete_*.txt sentinel). Default is to delete existing output and rebuild from scratch.
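The resume behavior can be sketched as a scan for sentinel files. Only the `batch_complete_*.txt` pattern comes from the description above; the batch identifiers, the helper names, and the assumption that the identifier is whatever fills the wildcard are hypothetical.

```python
import glob
import os
import tempfile

PREFIX = "batch_complete_"
SUFFIX = ".txt"


def completed_batches(output_dir):
    """Collect batch identifiers from batch_complete_*.txt sentinel files."""
    pattern = os.path.join(output_dir, PREFIX + "*" + SUFFIX)
    names = (os.path.basename(path) for path in glob.glob(pattern))
    return {name[len(PREFIX):-len(SUFFIX)] for name in names}


def batches_to_run(output_dir, all_batch_ids, resume):
    """Skip finished batches only when resuming.

    Without resume every batch is returned; the deletion of existing
    output that the script performs in that mode is omitted here.
    """
    if not resume:
        return list(all_batch_ids)
    done = completed_batches(output_dir)
    return [batch_id for batch_id in all_batch_ids if batch_id not in done]


with tempfile.TemporaryDirectory() as out_dir:
    # Hypothetical identifiers; mark the first batch as finished.
    batch_ids = ["sca01_F129_0", "sca01_F129_1"]
    open(os.path.join(out_dir, PREFIX + batch_ids[0] + SUFFIX), "w").close()

    resumed = batches_to_run(out_dir, batch_ids, resume=True)
    fresh = batches_to_run(out_dir, batch_ids, resume=False)

print(resumed, fresh)
```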

mejiro.pipeline.romanisim_pipeline.exposure_cutout_name(input_path)

Derive the Exposure .npy cutout filename from a SyntheticImage input path.

Handles both full .pkl pickles and lightweight .npz inputs by stripping whatever extension is present before appending .npy (avoids .npz.npy).
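The extension handling can be illustrated with a small stand-in. This is not the source of `exposure_cutout_name`: the function name `cutout_name_sketch` and the example filenames are hypothetical, and any other renaming the real function may perform is not shown.

```python
import os


def cutout_name_sketch(input_path):
    """Strip whatever extension is present, then append .npy."""
    stem, _extension = os.path.splitext(os.path.basename(input_path))
    return stem + ".npy"


from_pickle = cutout_name_sketch("/data/lens_0001_F129.pkl")
from_npz = cutout_name_sketch("/data/lens_0001_F129.npz")
print(from_pickle, from_npz)
```

Stripping the extension first is what prevents the `.npz.npy` double extension that plain string concatenation would produce.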

Calculate SNRs

Calculates signal-to-noise ratios (SNRs) for simulated exposures.

This script computes SNR values for each lensing system processed in previous pipeline steps and saves name-SNR pairs for downstream filtering or analysis. It reads a YAML configuration file specifying SNR calculation parameters and supports both sequential and parallel processing modes.

Usage:

python3 calculate_snrs.py --config <config.yaml> [--sequential] [--resume]

Arguments:

--config: Path to the YAML configuration file.
--sequential: Run in sequential mode instead of parallel.
--resume: Preserve existing output and skip already-completed items. Default is to delete and rebuild from scratch.

Step 6: Export Dataset

Exports processed exposures and synthetic images to HDF5 format.

This script reads the exposures and synthetic images generated in previous pipeline steps and writes them to an HDF5 file, with relevant metadata and attributes for each lensing system and band. If configured, it also exports PSF data. The script reads a mejiro YAML configuration file specifying dataset options.

Usage:

python3 _06_h5_export.py --config <config.yaml> [--data_dir <output_dir>]

Arguments:

--config: Path to the YAML configuration file.
--data_dir: Optional override for the data directory specified in the config file.