ftag.vds
Functions
| Function | Description |
|---|---|
| parse_args() | |
| get_virtual_layout() | Concatenate group from multiple files into a single VirtualDataset. |
| glob_re() | Return list of filenames that match REGEX pattern inside regex_path. |
| regex_files_from_dir() | Turn a list of basenames into full paths; dive into sub-dirs if needed. |
| sum_counts_once() | Reduce the arrays in the counts dataset for one file to a scalar via summation. |
| check_subgroups() | Check which subgroups are available for the bookkeeper. |
| aggregate_cutbookkeeper() | Aggregate the cutBookkeeper in the input files. |
| create_virtual_file() | Create the virtual dataset file for the given inputs. |
| main() | |
Module Contents
- ftag.vds.parse_args(args=None)
- ftag.vds.get_virtual_layout(fnames: list[str], group: str) → h5py.VirtualLayout
Concatenate group from multiple files into a single VirtualDataset.
- Parameters:
fnames (list[str]) – List with the file names
group (str) – Name of the group that is concatenated
- Returns:
Virtual layout of the new virtual dataset
- Return type:
h5py.VirtualLayout
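As a rough illustration of what concatenating a group into a virtual layout involves, the sketch below uses h5py's public `VirtualSource` and `VirtualLayout` classes directly. It is not taken from the ftag source: the helper name `concat_layout`, the file names, and the `jets` dataset are all made up for the example.

```python
import tempfile
from pathlib import Path

import h5py
import numpy as np


def concat_layout(fnames: list[str], group: str) -> h5py.VirtualLayout:
    """Hypothetical helper: stack `group` from every file along axis 0."""
    sources = []
    for fname in fnames:
        with h5py.File(fname, "r") as f:
            # A VirtualSource records the file name, dataset name, shape and dtype,
            # so it stays usable after the file is closed.
            sources.append(h5py.VirtualSource(f[group]))
            dtype = f[group].dtype
            tail = f[group].shape[1:]

    total = sum(src.shape[0] for src in sources)
    layout = h5py.VirtualLayout(shape=(total, *tail), dtype=dtype)

    # Map each source onto its own slice of the concatenated axis.
    start = 0
    for src in sources:
        layout[start : start + src.shape[0]] = src
        start += src.shape[0]
    return layout


def demo() -> np.ndarray:
    """Write two small files, wrap them in a virtual dataset, read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        fnames = []
        for i, n in enumerate((2, 3)):
            fname = str(Path(tmp) / f"part{i}.h5")
            with h5py.File(fname, "w") as f:
                f.create_dataset("jets", data=np.full(n, i, dtype="i4"))
            fnames.append(fname)

        out = Path(tmp) / "vds.h5"
        with h5py.File(out, "w") as f:
            f.create_virtual_dataset("jets", concat_layout(fnames, "jets"))
        with h5py.File(out, "r") as f:
            return f["jets"][:]
```

The virtual file stores only the mapping, so the source files must remain readable at the recorded paths whenever the virtual dataset is opened.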
- ftag.vds.glob_re(pattern: str | None, regex_path: str | None) → list[str] | None
Return list of filenames that match REGEX pattern inside regex_path.
- Parameters:
pattern (str | None) – Regex pattern for the input file names
regex_path (str | None) – Directory whose entries are matched against the pattern
- Returns:
List of the file basenames that matched the regex pattern
- Return type:
list[str]
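One plausible way to get this behaviour with the standard library is to compile the pattern and filter a directory listing. This is a sketch rather than the ftag implementation: `match_basenames` is a hypothetical name, and returning `None` for missing inputs is an assumption based on the optional types in the signature.

```python
import os
import re
import tempfile
from pathlib import Path
from typing import Optional


def match_basenames(pattern: Optional[str], directory: Optional[str]) -> Optional[list[str]]:
    """Hypothetical stand-in: basenames in `directory` that match `pattern`."""
    if pattern is None or directory is None:
        # Assumption: missing inputs yield None, mirroring the optional types.
        return None
    regex = re.compile(pattern)
    # regex.match anchors at the start of the name only, not at the end.
    return sorted(name for name in os.listdir(directory) if regex.match(name))


def demo() -> Optional[list[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("ttbar_1.h5", "ttbar_2.h5", "zprime_1.h5", "notes.txt"):
            (Path(tmp) / name).touch()
        return match_basenames(r"ttbar_\d+\.h5", tmp)
```

Note that the result holds basenames only; joining them back to the directory is a separate step.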
- ftag.vds.regex_files_from_dir(reg_matched_fnames: list[str] | None, regex_path: str | None) → list[str] | None
Turn a list of basenames into full paths; dive into sub-dirs if needed.
- Parameters:
reg_matched_fnames (list[str] | None) – List of the regex-matched file basenames
regex_path (str | None) – Directory in which the basenames were matched
- Returns:
List of file paths (as strings) that matched the regex and any subsequent globbing inside matched directories.
- Return type:
list[str]
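The "dive into sub-dirs" step might look like the following sketch: each basename is joined to the directory, and a match that is itself a directory is expanded by globbing inside it. The `*.h5` glob and the helper name `expand_matches` are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def expand_matches(basenames: list[str], directory: str) -> list[str]:
    """Hypothetical stand-in: basenames -> full paths, expanding directories."""
    paths: list[str] = []
    for name in basenames:
        full = Path(directory) / name
        if full.is_dir():
            # Assumption: only HDF5 files directly inside a matched directory count.
            paths.extend(str(p) for p in sorted(full.glob("*.h5")))
        else:
            paths.append(str(full))
    return paths


def demo() -> list[str]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "single.h5").touch()
        (root / "sample_dir").mkdir()
        (root / "sample_dir" / "a.h5").touch()
        (root / "sample_dir" / "b.h5").touch()
        (root / "sample_dir" / "log.txt").touch()
        found = expand_matches(["single.h5", "sample_dir"], tmp)
        # Report paths relative to the temporary root for readability.
        return [Path(p).relative_to(root).as_posix() for p in found]
```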
- ftag.vds.sum_counts_once(counts: numpy.ndarray) → numpy.ndarray
Reduce the arrays in the counts dataset for one file to a scalar via summation.
- Parameters:
counts (np.ndarray) – Array from the h5py dataset (counts) from the cutBookkeeper groups
- Returns:
Array with the summed variables for the file
- Return type:
np.ndarray
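To make the per-file reduction concrete, here is a minimal NumPy sketch that sums each field of a structured array into a zero-dimensional record. The field names in `COUNTS_DTYPE` and the helper name `sum_fields` are hypothetical; the real bookkeeper dtype may differ.

```python
import numpy as np

# Hypothetical field names; the real bookkeeper dtype may differ.
COUNTS_DTYPE = np.dtype([("nEventsProcessed", "u8"), ("sumOfWeights", "f8")])


def sum_fields(counts: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: sum every field of a structured array."""
    total = np.zeros((), dtype=counts.dtype)  # zero-dimensional record
    for field in counts.dtype.names:
        total[field] = counts[field].sum()
    return total


counts = np.array([(10, 9.5), (20, 19.0), (5, 4.5)], dtype=COUNTS_DTYPE)
total = sum_fields(counts)
```

Keeping the original dtype means the summed record can be written back with the same field layout as the inputs.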
- ftag.vds.check_subgroups(fnames: list[str], group_name: str = 'cutBookkeeper') → list[str]
Check which subgroups are available for the bookkeeper.
Find the intersection of sub-group names that have a ‘counts’ dataset in every input file. (Using the intersection makes the script robust even if a few files are missing a variation.)
- Parameters:
fnames (list[str]) – List of the input files
group_name (str, optional) – Group name in the h5 files of the bookkeeper, by default “cutBookkeeper”
- Returns:
Names of the sub-groups that have a ‘counts’ dataset in every input file
- Return type:
list[str]
- Raises:
KeyError – When a file does not have a bookkeeper
ValueError – When no common bookkeeper sub-groups were found
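The intersection logic can be sketched without touching HDF5 by modelling each file as nested dictionaries. Everything here is illustrative: `common_subgroups` is a hypothetical name, the nested-dict layout stands in for real file contents, and returning a sorted list is an assumption.

```python
from typing import Optional


def common_subgroups(files: dict[str, dict], group_name: str = "cutBookkeeper") -> list[str]:
    """Hypothetical stand-in working on {file: {group: {subgroup: {datasets}}}}."""
    common: Optional[set[str]] = None
    for fname, tree in files.items():
        if group_name not in tree:
            raise KeyError(f"{fname} has no '{group_name}' group")
        # Keep only sub-groups that actually carry a 'counts' dataset.
        here = {sub for sub, datasets in tree[group_name].items() if "counts" in datasets}
        common = here if common is None else common & here
    if not common:
        raise ValueError("no common bookkeeper sub-groups found")
    return sorted(common)


files = {
    "a.h5": {"cutBookkeeper": {"nominal": {"counts"}, "sysUp": {"counts"}}},
    # sysUp exists in b.h5 but lacks a 'counts' dataset, so it drops out.
    "b.h5": {"cutBookkeeper": {"nominal": {"counts"}, "sysUp": set()}},
}
subgroups = common_subgroups(files)
```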
- ftag.vds.aggregate_cutbookkeeper(fnames: list[str], group_name: str = 'cutBookkeeper') → dict[str, numpy.ndarray] | None
Aggregate the cutBookkeeper in the input files.
For every input file and every sub-group (nominal, sysUp, sysDown, …):
1. Sum the 4-entry record array inside each file into one record.
2. Add those records from all files together into a grand total.
Returns a dict {subgroup_name: scalar-record-array}.
- Parameters:
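fnames (list[str]) – List of the input files
group_name (str, optional) – Group name in the h5 files of the bookkeeper, by default “cutBookkeeper”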
- Returns:
Dict with the accumulated cutBookkeeper groups. If the cut bookkeeper is not in the files, return None.
- Return type:
dict[str, np.ndarray] | None
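The two-step accumulation described for this function can be sketched in NumPy as follows. The per-file inputs are modelled as in-memory dictionaries rather than HDF5 files, and the dtype fields and the helper names `collapse` and `aggregate` are hypothetical.

```python
import numpy as np

DTYPE = np.dtype([("nEvents", "u8"), ("sumW", "f8")])  # hypothetical fields


def collapse(counts: np.ndarray) -> np.ndarray:
    """Step 1: reduce one file's record array to a single record."""
    total = np.zeros((), dtype=counts.dtype)
    for field in counts.dtype.names:
        total[field] = counts[field].sum()
    return total


def aggregate(per_file: list[dict[str, np.ndarray]]) -> dict[str, np.ndarray]:
    """Step 2: add the per-file records together, sub-group by sub-group."""
    totals: dict[str, np.ndarray] = {}
    for file_counts in per_file:
        for subgroup, counts in file_counts.items():
            record = collapse(counts)
            if subgroup not in totals:
                totals[subgroup] = record
            else:
                for field in record.dtype.names:
                    totals[subgroup][field] += record[field]
    return totals


per_file = [
    {
        "nominal": np.array([(10, 1.0), (20, 2.0)], dtype=DTYPE),
        "sysUp": np.array([(5, 0.5)], dtype=DTYPE),
    },
    {
        "nominal": np.array([(30, 3.0)], dtype=DTYPE),
        "sysUp": np.array([(7, 0.25), (3, 0.25)], dtype=DTYPE),
    },
]
totals = aggregate(per_file)
```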
- ftag.vds.create_virtual_file(pattern: pathlib.Path | str, out_fname: pathlib.Path | str | None = None, use_regex: bool = False, regex_path: str | None = None, overwrite: bool = False, bookkeeper_name: str = 'cutBookkeeper') → pathlib.Path
Create the virtual dataset file for the given inputs.
- Parameters:
pattern (Path | str) – Pattern of the input files. Wildcards are supported
out_fname (Path | str | None, optional) – Output path to which the virtual dataset file is written, by default None
use_regex (bool, optional) – Whether to interpret the pattern as a regex instead of a glob, by default False
regex_path (str | None, optional) – Directory in which the regex pattern is matched, by default None
overwrite (bool, optional) – Whether to overwrite an existing output file, by default False
bookkeeper_name (str, optional) – Name of the cut bookkeeper in the h5 files, by default “cutBookkeeper”
- Returns:
Path to which the output file is written
- Return type:
Path
- Raises:
FileNotFoundError – If no input files were found for the given pattern
ValueError – If no output file is given and the input comes from multiple directories
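The two documented error conditions suggest an output-path rule along the lines of the sketch below. It is an assumption-laden illustration, not the ftag code: `resolve_output` is a hypothetical helper, and the default file name `vds.h5` is invented for the example.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_output(
    fnames: list[str],
    out_fname: Optional[Union[Path, str]] = None,
    default_name: str = "vds.h5",  # hypothetical default, not from the ftag docs
) -> Path:
    """Hypothetical helper: pick the output path for the virtual file."""
    if not fnames:
        raise FileNotFoundError("no input files found for the given pattern")
    if out_fname is not None:
        return Path(out_fname)
    # Without an explicit output, a default is only unambiguous when
    # every input file lives in the same directory.
    parents = {Path(f).parent for f in fnames}
    if len(parents) > 1:
        raise ValueError("inputs span multiple directories; pass out_fname explicitly")
    return parents.pop() / default_name


same_dir = resolve_output(["data/a.h5", "data/b.h5"])
explicit = resolve_output(["x/a.h5", "y/b.h5"], out_fname="merged/out.h5")
```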
- ftag.vds.main(args=None) → None