synthesizer.pipeline.pipeline_io¶
A module for handling I/O operations in the pipeline.
This module contains classes and functions for reading and writing data in the pipeline, including reading and writing HDF5 files and handling the MPI communication needed for parallel I/O operations.
Example usage:
# Write data to an HDF5 file
writer = PipelineIO("output.hdf5")
writer.write_data(data, key)
Classes
- class synthesizer.pipeline.pipeline_io.PipelineIO(filepath, comm=None, ngalaxies_local=None, start_time=None, verbose=1, parallel_io=False)[source]¶
A class for writing data to an HDF5 file.
This class provides methods for writing data to an HDF5 file. It can handle writing data in parallel using MPI if the h5py library has been built with parallel support.
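A minimal construction sketch, assuming an MPI launch (e.g. via mpirun) and that mpi4py is installed; the local galaxy count here is hypothetical:

from mpi4py import MPI

from synthesizer.pipeline.pipeline_io import PipelineIO

comm = MPI.COMM_WORLD

# Hypothetical count; in practice this comes from your partitioning.
ngalaxies_local = 100

writer = PipelineIO(
    "output.hdf5",
    comm=comm,
    ngalaxies_local=ngalaxies_local,
    verbose=1,
    # Only set True if h5py was built with parallel (MPI) support.
    parallel_io=False,
)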
- hdf¶
The HDF5 file to write to.
- Type:
h5py.File
- comm¶
The MPI communicator.
- Type:
mpi4py.MPI.Comm
- num_galaxies¶
The total number of galaxies.
- Type:
int
- rank¶
The rank of the MPI process.
- Type:
int
- is_parallel¶
Whether the writer is running in parallel.
- Type:
bool
- is_root¶
Whether the writer is running on the root process.
- Type:
bool
- is_collective¶
Whether the writer is running in collective mode.
- Type:
bool
- verbose¶
How verbose the output should be.
- Type:
int
- _start_time¶
The start time of the pipeline.
- Type:
float
- combine_rank_files()[source]¶
Combine the rank files into a single file.
- combine_rank_files_virtual()[source]¶
Combine the rank files into a single virtual file.
Note that the virtual file this produces requires the rank files to remain in the same location as when they were created.
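To illustrate why the rank files must stay put, here is a minimal sketch of the h5py virtual-dataset mechanism this method relies on; the file names, dataset key, and per-rank lengths are all hypothetical:

import h5py
import numpy as np

# Hypothetical per-rank files, each holding a 1D "flux" dataset.
rank_files = ["output_rank0.hdf5", "output_rank1.hdf5"]
lengths = [100, 150]

layout = h5py.VirtualLayout(shape=(sum(lengths),), dtype=np.float64)
offset = 0
for fname, n in zip(rank_files, lengths):
    # A VirtualSource records the *path* to the rank file; the combined
    # file holds references, not data, which is why the rank files must
    # remain where they were when the virtual file was built.
    layout[offset : offset + n] = h5py.VirtualSource(fname, "flux", shape=(n,))
    offset += n

with h5py.File("combined.hdf5", "w") as out:
    out.create_virtual_dataset("flux", layout)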
- create_datasets_parallel(data, key)[source]¶
Create datasets ready to be populated in parallel.
This is only needed for collective I/O operations. We first create the datasets in serial so that they can be written to in any order on any rank.
- Parameters:
data (dict) – The data to create datasets for.
key (str) – The key under which to create the datasets.
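As a rough illustration of the create-then-populate pattern in plain h5py and mpi4py (a sketch, not this method's implementation; the counts and key are hypothetical, and h5py must be built with parallel support):

from mpi4py import MPI
import h5py
import numpy as np

comm = MPI.COMM_WORLD
n_local = 100                          # hypothetical per-rank row count
n_total = comm.allreduce(n_local)      # total rows across all ranks
offset = comm.scan(n_local) - n_local  # exclusive prefix sum = this rank's start

with h5py.File("parallel.hdf5", "w", driver="mpio", comm=comm) as f:
    # Dataset creation is collective: every rank must make the same
    # call with identical arguments before any rank writes.
    dset = f.create_dataset("flux", shape=(n_total,), dtype=np.float64)

    # Each rank can then populate its own slab in any order.
    dset[offset : offset + n_local] = np.random.random(n_local)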
- create_file_with_metadata(instruments, emission_model)[source]¶
Write metadata to the HDF5 file.
This writes useful metadata to the root group of the HDF5 file and outputs the instruments and emission model to the appropriate groups.
- Parameters:
instruments (dict) – A dictionary of instrument objects.
emission_model (dict) – A dictionary of emission model objects.
- write_data(data, key, indexes=None, root=0)[source]¶
Write data using the appropriate method based on the environment.
- Parameters:
data (any) – The data to write.
key (str) – The key to write the data to.
indexes (array-like, optional) – The indexes at which to write the data.
root (int, optional) – The root rank for gathering and writing.
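A short usage sketch (the dataset key and data are hypothetical):

import numpy as np
from unyt import unyt_array

from synthesizer.pipeline.pipeline_io import PipelineIO

writer = PipelineIO("output.hdf5")
fluxes = unyt_array(np.random.random(100), "erg/s")

# In serial this writes directly; under MPI the data are gathered to
# `root` or written collectively, depending on the environment.
writer.write_data(fluxes, "Galaxies/Photometry/Fluxes", root=0)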
- write_dataset(data, key)[source]¶
Write a dataset to an HDF5 file.
We handle several different cases here:
- If the data is a unyt object, we write the value and units.
- If the data is a string, we convert it to an h5py-compatible string and write it with dimensionless units.
- If the data is a numpy array, we write the data and set the units to "dimensionless".
- Parameters:
data (any) – The data to write.
key (str) – The key to write the data to.
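The case handling can be pictured roughly like this (a sketch of the behaviour described above, not the actual implementation; the "Units" attribute name is an assumption):

import h5py
import numpy as np
from unyt import unyt_array


def write_dataset_sketch(hdf, data, key):
    """Sketch of the per-type handling described above."""
    if isinstance(data, unyt_array):
        # unyt objects: store the raw values and record the units
        # (here as an attribute; the attribute name is an assumption).
        dset = hdf.create_dataset(key, data=data.value)
        dset.attrs["Units"] = str(data.units)
    elif isinstance(data, str):
        # Strings: convert to an h5py-compatible byte string.
        dset = hdf.create_dataset(key, data=data.encode("utf-8"))
        dset.attrs["Units"] = "dimensionless"
    else:
        # Plain numpy arrays and scalars: dimensionless units.
        dset = hdf.create_dataset(key, data=np.asarray(data))
        dset.attrs["Units"] = "dimensionless"


with h5py.File("output.hdf5", "w") as f:
    write_dataset_sketch(f, unyt_array([1.0, 2.0], "erg/s"), "luminosity")
    write_dataset_sketch(f, "a label", "label")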
- write_dataset_parallel(data, key)[source]¶
Write a dataset to an HDF5 file in parallel.
This function requires that h5py has been built with parallel support.
- Parameters:
data (any) – The data to write.
key (str) – The key to write the data to.
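For reference, h5py exposes MPI collective writes through a context manager; a minimal sketch, again assuming a parallel build of h5py and hypothetical counts and keys:

from mpi4py import MPI
import h5py
import numpy as np

comm = MPI.COMM_WORLD
n_local = 100                          # hypothetical per-rank count
offset = comm.scan(n_local) - n_local  # exclusive prefix sum

with h5py.File("parallel.hdf5", "w", driver="mpio", comm=comm) as f:
    dset = f.create_dataset("flux", shape=(comm.allreduce(n_local),))
    # In collective mode every rank must enter the context and issue
    # a write, allowing MPI-IO to aggregate the requests.
    with dset.collective:
        dset[offset : offset + n_local] = np.random.random(n_local)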