ftag.vds#

Functions#

`parse_args`([args])
`get_virtual_layout`(→ h5py.VirtualLayout)	Concatenate group from multiple files into a single VirtualDataset.
`glob_re`(→ list[str] \| None)	Return list of filenames that match REGEX pattern inside regex_path.
`regex_files_from_dir`(→ list[str] \| None)	Turn a list of basenames into full paths; dive into sub-dirs if needed.
`sum_counts_once`(→ numpy.ndarray)	Reduce the arrays in the counts dataset for one file to a scalar via summation.
`check_subgroups`(→ list[str])	Check which subgroups are available for the bookkeeper.
`aggregate_cutbookkeeper`(→ dict[str, numpy.ndarray] \| None)	Aggregate the cutBookkeeper in the input files.
`create_virtual_file`(→ pathlib.Path)	Create the virtual dataset file for the given inputs.
`main`(→ None)

Module Contents#

ftag.vds.parse_args(args=None)#

ftag.vds.get_virtual_layout(fnames: list[str], group: str) → h5py.VirtualLayout#

Concatenate group from multiple files into a single VirtualDataset.

Parameters:

fnames (list[str]) – List with the file names
group (str) – Name of the group that is concatenated

Returns:

Virtual layout of the new virtual dataset

Return type:

h5py.VirtualLayout
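
The concatenation described above can be sketched with the public h5py virtual dataset API. This is a minimal sketch and not the actual ftag implementation: the helper name `sketch_virtual_layout` is hypothetical, and it assumes `group` names a dataset that exists in every file and that the inputs are stacked along the first axis.

```python
import h5py
import numpy as np


def sketch_virtual_layout(fnames: list[str], group: str) -> h5py.VirtualLayout:
    """Hypothetical stand-in: stack the dataset `group` from each file along axis 0."""
    # First pass: record the shape of the dataset in every file, and its dtype.
    shapes = []
    dtype = None
    for fname in fnames:
        with h5py.File(fname, "r") as f:
            shapes.append(f[group].shape)
            dtype = f[group].dtype

    # The layout is as long as all inputs together; trailing axes are
    # assumed to be identical across files.
    total = sum(shape[0] for shape in shapes)
    layout = h5py.VirtualLayout(shape=(total, *shapes[0][1:]), dtype=dtype)

    # Second pass: map each source dataset onto its slice of the layout.
    offset = 0
    for fname, shape in zip(fnames, shapes):
        layout[offset : offset + shape[0]] = h5py.VirtualSource(fname, group, shape=shape)
        offset += shape[0]
    return layout
```

The returned layout can then be passed to `h5py.File.create_virtual_dataset` to write the virtual dataset into an output file.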

ftag.vds.glob_re(pattern: str | None, regex_path: str | None) → list[str] | None#

Return list of filenames that match REGEX pattern inside regex_path.

Parameters:

pattern (str | None) – Regex pattern that the input file names must match
regex_path (str | None) – Path inside which the regex pattern is matched

Returns:

List of the file basenames that matched the regex pattern

Return type:

list[str] | None
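
To illustrate the idea of regex-based globbing, here is a minimal standard-library sketch. The helper name `sketch_glob_re` is hypothetical; the use of `re.match` (anchored at the start of the name), the sorted output, and returning None when either argument is None are all assumptions rather than documented behaviour of `glob_re`.

```python
import os
import re
from typing import Optional


def sketch_glob_re(pattern: Optional[str], regex_path: Optional[str]) -> Optional[list[str]]:
    """Hypothetical stand-in: basenames inside `regex_path` that match `pattern`."""
    # Assumption: with no pattern or no directory there is nothing to match.
    if pattern is None or regex_path is None:
        return None
    # Keep only the directory entries whose name matches the regex.
    return sorted(name for name in os.listdir(regex_path) if re.match(pattern, name))
```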

ftag.vds.regex_files_from_dir(reg_matched_fnames: list[str] | None, regex_path: str | None) → list[str] | None#

Turn a list of basenames into full paths; dive into sub-dirs if needed.

Parameters:

reg_matched_fnames (list[str] | None) – List of the regex matched file names
regex_path (str | None) – Path inside which the regex-matched names are located

Returns:

List of file paths (as strings) that matched the regex and any subsequent globbing inside matched directories.

Return type:

list[str] | None
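
The step from basenames to full paths might look roughly like the sketch below. The helper name `sketch_full_paths` is hypothetical, and the `*.h5` glob applied inside matched directories is an assumption; the documentation above only states that matched directories are searched further.

```python
from pathlib import Path
from typing import Optional


def sketch_full_paths(
    basenames: Optional[list[str]], regex_path: Optional[str]
) -> Optional[list[str]]:
    """Hypothetical stand-in: join basenames with `regex_path`, expanding directories."""
    if basenames is None or regex_path is None:
        return None
    paths: list[str] = []
    for name in basenames:
        candidate = Path(regex_path) / name
        if candidate.is_dir():
            # Assumption: a matched directory contributes the .h5 files inside it.
            paths.extend(str(p) for p in sorted(candidate.glob("*.h5")))
        else:
            paths.append(str(candidate))
    return paths
```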

ftag.vds.sum_counts_once(counts: numpy.ndarray) → numpy.ndarray#

Reduce the arrays in the counts dataset for one file to a scalar via summation.

Parameters:

counts (np.ndarray) – Array from the h5py dataset (counts) from the cutBookkeeper groups

Returns:

Array with the summed variables for the file

Return type:

np.ndarray
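
A field-wise reduction of a structured array to a single record can be sketched as follows. The helper name `sketch_sum_counts` and the field names `nEvents` and `sumOfWeights` in the usage are hypothetical; the real counts dataset may use different fields.

```python
import numpy as np


def sketch_sum_counts(counts: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: sum every field of a structured array into one record."""
    # A 0-dimensional array with the same structured dtype holds the single record.
    total = np.zeros((), dtype=counts.dtype)
    for field in counts.dtype.names:
        total[field] = counts[field].sum()
    return total


# Usage with made-up field names and values.
counts = np.array(
    [(1, 0.5), (2, 1.5), (3, 1.0), (4, 2.0)],
    dtype=[("nEvents", "i8"), ("sumOfWeights", "f8")],
)
total = sketch_sum_counts(counts)
```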

ftag.vds.check_subgroups(fnames: list[str], group_name: str = 'cutBookkeeper') → list[str]#

Check which subgroups are available for the bookkeeper.

Find the intersection of sub-group names that have a ‘counts’ dataset in every input file. (Using the intersection makes the script robust even if a few files are missing a variation.)

Parameters:

fnames (list[str]) – List of the input files
group_name (str, optional) – Group name in the h5 files of the bookkeeper, by default “cutBookkeeper”

Returns:

Names of the sub-groups that are common to all input files

Return type:

set[str]

Raises:

KeyError – When a file does not have a bookkeeper
ValueError – When no common bookkeeper sub-groups were found
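
The intersection logic can be illustrated without opening any files. The sketch below is written against the generic mapping interface, so nested dicts stand in for opened HDF5 files; the helper name `sketch_common_subgroups` is hypothetical, and it assumes every child of the bookkeeper group is itself a group.

```python
def sketch_common_subgroups(files, group_name="cutBookkeeper"):
    """Hypothetical stand-in: sub-group names that hold 'counts' in every file."""
    common = None
    for f in files:
        if group_name not in f:
            raise KeyError(f"No '{group_name}' group found in input")
        # Sub-groups of this file that actually contain a 'counts' dataset.
        names = {name for name, sub in f[group_name].items() if "counts" in sub}
        # Intersect with what has been seen in the previous files.
        common = names if common is None else common & names
    if not common:
        raise ValueError("No common bookkeeper sub-groups found")
    return sorted(common)
```

Because h5py groups also support `in`, indexing, and `items()`, the same loop could be applied to files opened with `h5py.File`.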

ftag.vds.aggregate_cutbookkeeper(fnames: list[str], group_name: str = 'cutBookkeeper') → dict[str, numpy.ndarray] | None#

Aggregate the cutBookkeeper in the input files.

For every input file and every sub-group (nominal, sysUp, sysDown, …):

1. Sum the 4-entry record array inside each file into one record.
2. Add those records from all files together into a grand total.

Returns a dict {subgroup_name: scalar-record-array}.

Parameters:

fnames (list[str]) – List of the input files
group_name (str, optional) – Group name of the cutBookkeeper. By default “cutBookkeeper”

Returns:

Dict with the accumulated cutBookkeeper groups. If the cut bookkeeper is not in the files, return None.

Return type:

dict[str, np.ndarray] | None
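
The two-step accumulation can be sketched with NumPy alone. In this sketch, plain dicts of structured arrays stand in for the per-file cutBookkeeper groups; the helper name `sketch_aggregate` and the field names in the usage are hypothetical.

```python
import numpy as np


def sketch_aggregate(per_file_counts: list[dict[str, np.ndarray]]) -> dict[str, np.ndarray]:
    """Hypothetical stand-in: field-wise grand total per sub-group over all files."""
    totals: dict[str, np.ndarray] = {}
    for file_counts in per_file_counts:
        for name, counts in file_counts.items():
            # Start each sub-group from a zeroed scalar record of the same dtype.
            if name not in totals:
                totals[name] = np.zeros((), dtype=counts.dtype)
            # Reduce this file's records per field and add them to the running total.
            for field in counts.dtype.names:
                totals[name][field] += counts[field].sum()
    return totals


# Usage with made-up field names and values for two input files.
dtype = [("nEvents", "i8"), ("sumOfWeights", "f8")]
file_a = {"nominal": np.array([(1, 0.5), (2, 1.5)], dtype=dtype)}
file_b = {"nominal": np.array([(3, 1.0), (4, 2.0)], dtype=dtype)}
totals = sketch_aggregate([file_a, file_b])
```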

ftag.vds.create_virtual_file(pattern: pathlib.Path | str, out_fname: pathlib.Path | str | None = None, use_regex: bool = False, regex_path: str | None = None, overwrite: bool = False, bookkeeper_name: str = 'cutBookkeeper') → pathlib.Path#

Create the virtual dataset file for the given inputs.

Parameters:

pattern (Path | str) – Pattern of the input files used. Wildcards are supported
out_fname (Path | str | None, optional) – Output path to which the virtual dataset file is written, by default None
use_regex (bool, optional) – Whether to use regex instead of glob to select the input files, by default False
regex_path (str | None, optional) – Regex logic used to define the input files, by default None
overwrite (bool, optional) – Whether an existing output file is overwritten, by default False
bookkeeper_name (str, optional) – Name of the cut bookkeeper in the h5 files, by default “cutBookkeeper”

Returns:

Path to which the output file is written

Return type:

Path

Raises:

FileNotFoundError – If no input files were found for the given pattern
ValueError – If no output file is given and the input comes from multiple directories

ftag.vds.main(args=None) → None#
