ftag.hdf5.h5utils

Functions

compare_groups(g1, g2[, path])

Recursively compare two h5py.Groups or in-memory dicts.

write_group_full(h5group, data)

Write a nested dictionary structure to an HDF5 group.

extract_group_full(→ dict)

Extract the full contents of an HDF5 group into a nested dictionary.

get_dtype(→ numpy.dtype)

Return a dtype based on an existing dataset and requested variables.

cast_dtype(→ numpy.dtype)

Cast float type to half or full precision.

join_structured_arrays(arrays)

Join a list of structured numpy arrays.

structured_from_dict(→ numpy.ndarray)

Convert a dict to a structured array.

Module Contents

ftag.hdf5.h5utils.compare_groups(g1: h5py.Group | dict, g2: h5py.Group | dict, path: str = '')

Recursively compare two h5py.Groups or in-memory dicts.

Parameters:
  • g1 (h5py.Group | dict) – First group or dict to compare

  • g2 (h5py.Group | dict) – Second group or dict to compare

  • path (str, optional) – Path to the current group, by default “”

Raises:

TypeError – If the types of the items do not match
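The recursive comparison described above can be illustrated on plain dictionaries. The following is a minimal sketch, not the library's implementation: `compare_nested` is a hypothetical name, it handles only in-memory dicts with list or scalar leaves, and returning a bool is a choice made for illustration, since the documented function lists no return value.

```python
def compare_nested(a, b, path=""):
    """Hypothetical stand-in for the dict branch of a recursive comparison."""
    # Mirror the documented behaviour: mismatched item types raise TypeError.
    if type(a) is not type(b):
        raise TypeError(
            f"Type mismatch at '{path}': {type(a).__name__} vs {type(b).__name__}"
        )
    if isinstance(a, dict):
        # Both sides must hold the same keys before descending.
        if a.keys() != b.keys():
            return False
        return all(compare_nested(a[k], b[k], f"{path}/{k}") for k in a)
    # Leaves are compared directly (lists and scalars only in this sketch;
    # NumPy arrays would need an element-wise comparison instead).
    return a == b


left = {"jets": {"pt": [1.0, 2.0]}, "_group_attrs": {"n": 2}}
right = {"jets": {"pt": [1.0, 2.0]}, "_group_attrs": {"n": 2}}
same = compare_nested(left, right)
```

The `path` argument is extended on every recursive call so that an error message can point to the exact nested entry that differs.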

ftag.hdf5.h5utils.write_group_full(h5group: h5py.Group, data: dict)

Write a nested dictionary structure to an HDF5 group.

This function recursively writes a dictionary containing datasets and subgroups to an HDF5 group. The dictionary should have the structure created by extract_group_full().

Parameters:
  • h5group (h5py.Group) – The HDF5 group to write data to

  • data (dict) – Dictionary containing the data structure to write. Can contain:
      - ‘_group_attrs’: dict of group-level attributes
      - dataset entries: dict with ‘data’ and ‘attrs’ keys
      - subgroup entries: nested dictionaries

Raises:

TypeError – If an unexpected value type is encountered in the data dict

ftag.hdf5.h5utils.extract_group_full(group: h5py.Group) → dict

Extract the full contents of an HDF5 group into a nested dictionary.

This function recursively extracts all datasets, subgroups, and attributes from an HDF5 group into an in-memory dictionary structure. Group-level attributes are stored under the ‘_group_attrs’ key.

Parameters:

group (h5py.Group) – The HDF5 group to extract data from

Returns:

Nested dictionary containing:
  - ‘_group_attrs’: dict of group-level attributes (if any)
  - dataset entries: dict with ‘data’ (array) and ‘attrs’ (dict) keys
  - subgroup entries: nested dictionaries with the same structure

Return type:

dict

Raises:

TypeError – If an unsupported HDF5 item type is encountered
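To make the nested layout concrete, here is a small sketch of a dictionary in the shape described above, together with a helper that walks it. The group and dataset names, the attribute values, and the helper `list_datasets` are all hypothetical. Plain lists stand in for arrays to keep the example self-contained, and the rule used to tell a dataset entry from a subgroup (a dict whose keys are exactly 'data' and 'attrs') is an assumption, not something the source specifies.

```python
# Hypothetical example of the nested structure; real 'data' values would be arrays.
example = {
    "_group_attrs": {"version": 1},
    "jets": {"data": [1.0, 2.0, 3.0], "attrs": {"unit": "GeV"}},
    "truth": {
        "_group_attrs": {},
        "hadrons": {"data": [5, 4], "attrs": {}},
    },
}


def list_datasets(tree, prefix=""):
    """Return the paths of all dataset entries in a nested dict (illustrative)."""
    paths = []
    for key, value in tree.items():
        if key == "_group_attrs":
            continue  # group-level attributes, not a child item
        path = f"{prefix}/{key}"
        # Assumption: a dataset entry is a dict with exactly these two keys.
        if isinstance(value, dict) and set(value) == {"data", "attrs"}:
            paths.append(path)
        elif isinstance(value, dict):
            paths.extend(list_datasets(value, path))  # descend into subgroup
    return paths


dataset_paths = list_datasets(example)
```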

ftag.hdf5.h5utils.get_dtype(ds: h5py.Dataset, variables: list[str] | None = None, precision: str | None = None, transform: ftag.transform.Transform | None = None, full_precision_vars: list[str] | None = None) → numpy.dtype

Return a dtype based on an existing dataset and requested variables.

Parameters:
  • ds (h5py.Dataset) – Input h5 dataset

  • variables (list[str] | None, optional) – List of variables to include in dtype, by default None

  • precision (str | None, optional) – Precision to cast floats to, “half” or “full”, by default None

  • transform (Transform | None, optional) – Transform to apply to variable names, by default None

  • full_precision_vars (list[str] | None, optional) – List of variables to keep in full precision, by default None

Returns:

Output dtype

Return type:

np.dtype

Raises:

ValueError – If variables are not found in dataset
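The variable-selection part of this behaviour can be sketched with NumPy alone. The helper `select_dtype` below is hypothetical and operates on a structured dtype directly (standing in for the dtype of a dataset); it ignores the precision, transform, and full_precision_vars parameters entirely, and its error message is invented for illustration.

```python
import numpy as np


def select_dtype(base, variables=None):
    """Build a structured dtype holding only the requested fields (illustrative)."""
    if variables is None:
        variables = list(base.names)  # assumption: None means "keep all fields"
    missing = [v for v in variables if v not in base.names]
    if missing:
        # Mirrors the documented ValueError for variables absent from the dataset.
        raise ValueError(f"Variables not found in dataset: {missing}")
    return np.dtype([(name, base[name]) for name in variables])


# Stand-in for the dtype of an h5py dataset.
base = np.dtype([("pt", "f4"), ("eta", "f4"), ("flavour", "i4")])
subset = select_dtype(base, ["pt", "flavour"])
```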

ftag.hdf5.h5utils.cast_dtype(typestr: str, precision: str) → numpy.dtype

Cast float type to half or full precision.

Parameters:
  • typestr (str) – Input type string

  • precision (str) – Precision to cast to, “half” or “full”

Returns:

Output dtype

Return type:

np.dtype

Raises:

ValueError – If precision is not “half” or “full”
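A rough sketch of such a cast is shown below. It rests on two assumptions the source does not state: that “half” maps to float16 and “full” to float32, and that non-float types are returned unchanged. The name `cast_float` is hypothetical.

```python
import numpy as np


def cast_float(typestr, precision):
    """Map a float type string to a chosen precision (illustrative sketch)."""
    # Assumption: "half" -> float16 and "full" -> float32.
    targets = {"half": np.dtype("float16"), "full": np.dtype("float32")}
    if precision not in targets:
        raise ValueError(f"Precision must be 'half' or 'full', got {precision!r}")
    dtype = np.dtype(typestr)
    # Assumption: only floating-point kinds are cast; everything else passes through.
    if dtype.kind != "f":
        return dtype
    return targets[precision]


half = cast_float("<f8", "half")
unchanged = cast_float("<i4", "half")
```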

ftag.hdf5.h5utils.join_structured_arrays(arrays: list)

Join a list of structured numpy arrays.

See numpy/numpy#7811

Parameters:

arrays (list) – List of structured numpy arrays to join

Returns:

A merged structured array

Return type:

np.ndarray
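One common way to join structured arrays is to allocate an output array whose dtype holds the fields of every input and then copy field by field. The sketch below assumes that "join" means merging the fields of arrays that share the same shape and have no field names in common; `join_fields` is a hypothetical name and not necessarily how the library does it.

```python
import numpy as np


def join_fields(arrays):
    """Merge the fields of same-shaped structured arrays (illustrative sketch)."""
    # Collect (name, dtype) pairs from every input, in order.
    fields = [(name, a.dtype[name]) for a in arrays for name in a.dtype.names]
    out = np.empty(arrays[0].shape, dtype=fields)
    # Copy each field into the combined array.
    for a in arrays:
        for name in a.dtype.names:
            out[name] = a[name]
    return out


kin = np.array([(1.0, 0.5), (2.0, -0.5)], dtype=[("pt", "f4"), ("eta", "f4")])
lab = np.array([(5,), (0,)], dtype=[("flavour", "i4")])
joined = join_fields([kin, lab])
```

Duplicate field names across inputs would make the combined dtype invalid, so NumPy raises a ValueError in that case.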

ftag.hdf5.h5utils.structured_from_dict(d: dict[str, numpy.ndarray]) → numpy.ndarray

Convert a dict to a structured array.

Parameters:

d (dict[str, np.ndarray]) – Input dict of numpy arrays

Returns:

Structured array

Return type:

np.ndarray
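The conversion can be sketched as follows, assuming every array in the dict has the same shape and that each key becomes a field name with the dtype of its array. `dict_to_structured` is a hypothetical helper written for illustration, not the library's implementation.

```python
import numpy as np


def dict_to_structured(d):
    """Pack a dict of equally shaped arrays into one structured array (sketch)."""
    arrays = {k: np.asarray(v) for k, v in d.items()}
    # Assumption: all arrays share the shape of the first one.
    shape = next(iter(arrays.values())).shape
    out = np.empty(shape, dtype=[(k, v.dtype) for k, v in arrays.items()])
    for k, v in arrays.items():
        out[k] = v  # each dict key becomes a named field
    return out


cols = {
    "pt": np.array([1.0, 2.0], dtype="f4"),
    "flavour": np.array([5, 0], dtype="i4"),
}
table = dict_to_structured(cols)
```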