ftag.find_metadata#

Attributes#

`XSECDB_MAP`
`XSECDB_URL_BASE`

Functions#

| Function | Description |
| --- | --- |
| `validate_url_scheme`(→ urllib.parse.ParseResult) | Validate the scheme of a given URL, ensuring it is http or https. |
| `download_xsecdb_files`(→ None) | Download the PMG xsecDB files from CERN if they are not present locally. |
| `extract_taskid_from_filename`(→ str \| None) | Extract the BigPanDA Task ID (8-digit) from an HDF5 filename. |
| `fetch_taskinfo_from_bigpanda`(→ dict[str, Any] \| None) | Fetch task information from BigPanDA for a given Task ID. |
| `extract_mc_container_from_json`(→ str \| None) | Extract the MC container name (e.g., mc16_13TeV.<something>) from a task JSON. |
| `parse_line_from_taskname`(→ tuple[int \| None, str \| None]) | Extract DSID and etag from a task name string. |
| `parse_campaign_from_taskname`(→ str \| None) | Derive campaign (mc15/mc16/etc.) from a task or container name. |
| `extract_info_from_container`(→ tuple[int, str, str] \| None) | Extract DSID, etag, and campaign name from a container string. |
| `query_xsecdb`(→ dict[str, Any] \| None) | Look up cross-section metadata in the PMG xsecDB. |
| `write_metadata_to_h5`(→ None) | Write metadata values into an HDF5 file under metadata/<DSID>. |
| `handle_yaml_fallback`(→ None) | Use fallback metadata from YAML if automatic lookup fails. |
| `parse_args_and_yaml`(→ tuple[list[str], dict[str, Any]]) | Parse CLI arguments and load YAML metadata if provided. |
| `process_single_file`(→ None) | Process a single .h5 file by attempting a BigPanDA lookup, then falling back to YAML. |
| `main`(→ None) | Entry point: parse arguments, download xsecDBs, process each file, and clean up. |

Module Contents#

ftag.find_metadata.XSECDB_MAP: dict[str, str]#

ftag.find_metadata.XSECDB_URL_BASE: str = 'https://atlas-groupdata.web.cern.ch/atlas-groupdata/dev/PMGTools/'#

ftag.find_metadata.validate_url_scheme(url: str) → urllib.parse.ParseResult#

Validate the scheme of a given URL, ensuring it is http or https.

Parameters: url (str) – URL string to validate.
Returns: Parsed URL object.
Return type: ParseResult
Raises: ValueError – If the URL scheme is not supported.
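
The documented behavior can be approximated with the standard library alone. The following is a minimal sketch, not the module's actual implementation; the name `validate_url_scheme_sketch` and the `ALLOWED_SCHEMES` constant are hypothetical and exist only for illustration.

```python
from urllib.parse import ParseResult, urlparse

# Assumption: only these two schemes are accepted, as the description states.
ALLOWED_SCHEMES = {"http", "https"}


def validate_url_scheme_sketch(url: str) -> ParseResult:
    """Parse ``url`` and reject any scheme other than http or https."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Unsupported URL scheme: {parsed.scheme!r}")
    return parsed
```

Checking the scheme before opening a URL guards against inputs such as `file://` paths being passed to a downloader.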

ftag.find_metadata.download_xsecdb_files() → None#

Download the PMG xsecDB files from CERN if they are not present locally.

ftag.find_metadata.extract_taskid_from_filename(h5_path: pathlib.Path) → str | None#

Extract the BigPanDA Task ID (8-digit) from an HDF5 filename.

Parameters: h5_path (Path) – Path object pointing to the .h5 file.
Returns: The Task ID as a string if found, otherwise None.
Return type: str | None
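
One plausible way to pull an 8-digit ID out of a filename is a regular expression that matches exactly eight consecutive digits. This is a sketch under that assumption; the real function may use a different pattern, and both `extract_taskid_sketch` and the example filename are hypothetical.

```python
from __future__ import annotations

import re
from pathlib import Path

# Assumption: the Task ID is a run of exactly eight digits that is not
# part of a longer digit sequence.
TASKID_RE = re.compile(r"(?<!\d)(\d{8})(?!\d)")


def extract_taskid_sketch(h5_path: Path) -> str | None:
    """Return the first standalone 8-digit run in the file name, if any."""
    match = TASKID_RE.search(h5_path.name)
    return match.group(1) if match else None


# Hypothetical filename, for illustration only.
print(extract_taskid_sketch(Path("user.someone.12345678._000001.output.h5")))
```

The lookarounds keep shorter or longer digit runs, such as the `_000001` file index in the example, from being mistaken for a Task ID.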

ftag.find_metadata.fetch_taskinfo_from_bigpanda(taskid: str) → dict[str, Any] | None#

Fetch task information from BigPanDA for a given Task ID.

Parameters: taskid (str) – BigPanDA task ID.
Returns: Task info as a dictionary if found, otherwise None.
Return type: dict[str, Any] | None

ftag.find_metadata.extract_mc_container_from_json(data: dict[str, Any]) → str | None#

Extract the MC container name (e.g., mc16_13TeV.<something>) from a task JSON.

Parameters: data (dict[str, Any]) – Task info dictionary from BigPanDA.
Returns: The container string if found, otherwise None.
Return type: str | None

ftag.find_metadata.parse_line_from_taskname(taskname: str) → tuple[int | None, str | None]#

Extract DSID and etag from a task name string.

Parameters: taskname (str) – Full task name.
Returns: A tuple of (DSID as int, etag as string), or (None, None) if not found.
Return type: tuple[int | None, str | None]

ftag.find_metadata.parse_campaign_from_taskname(taskname: str) → str | None#

Derive campaign (mc15/mc16/etc.) from a task or container name.

Parameters: taskname (str) – The name string.
Returns: Campaign string, or None if not found.
Return type: str | None

ftag.find_metadata.extract_info_from_container(container: str) → tuple[int, str, str] | None#

Extract DSID, etag, and campaign name from a container string.

Parameters: container (str) – The MC container string.
Returns: A tuple of (DSID, etag, campaign), or None if parsing fails.
Return type: tuple[int, str, str] | None
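
To illustrate the kind of parsing involved, the sketch below extracts the three fields from a container string with a single regular expression. It assumes the layout `<campaign>_<energy>TeV.<DSID>.<name>...<etag>_...`; the pattern, the function name `extract_info_sketch`, and the example container are illustrative assumptions rather than the module's own code.

```python
from __future__ import annotations

import re

# Assumed layout: campaign, energy, DSID, then dot-separated fields ending in
# a tag chain whose first element is the etag (e.g. e6337_s3126_...).
CONTAINER_RE = re.compile(
    r"(?P<campaign>mc\d{2})_[0-9p]+TeV\."
    r"(?P<dsid>\d+)\."
    r".*?\.(?P<etag>e\d+)(?:_|$)"
)


def extract_info_sketch(container: str) -> tuple[int, str, str] | None:
    """Return (DSID, etag, campaign), or None if the string does not match."""
    match = CONTAINER_RE.search(container)
    if match is None:
        return None
    return int(match["dsid"]), match["etag"], match["campaign"]


# Illustrative container name following the assumed layout.
example = (
    "mc16_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad"
    ".deriv.DAOD_FTAG1.e6337_s3126_r10201_p4060"
)
print(extract_info_sketch(example))
```

Returning `None` on a failed match, rather than raising, mirrors the documented return type and lets the caller decide whether to fall back to another metadata source.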

ftag.find_metadata.query_xsecdb(campaign: str, dsid: int, etag: str) → dict[str, Any] | None#

Look up cross-section metadata in the PMG xsecDB.

Parameters:

campaign (str) – Campaign name (e.g., mc16).
dsid (int) – Dataset ID.
etag (str) – Event tag.

Returns:

Dictionary with cross_section_pb, genFiltEff, kfactor, and etag if found, otherwise None.

Return type:

dict[str, Any] | None
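
The lookup can be pictured as a filter over a tabular text file keyed on DSID and etag. The sketch below runs against an in-memory, tab-separated excerpt; the column names, the values, and the function `query_table_sketch` are all hypothetical, and the real xsecDB file layout may differ.

```python
from __future__ import annotations

import csv
import io
from typing import Any

# Hypothetical excerpt: column names and numbers are made up for illustration.
SAMPLE_DB = (
    "dsid\tcross_section_pb\tgenFiltEff\tkfactor\tetag\n"
    "999001\t1.5\t0.5\t1.1\te1111\n"
    "999001\t1.6\t0.5\t1.1\te2222\n"
)


def query_table_sketch(table_text: str, dsid: int, etag: str) -> dict[str, Any] | None:
    """Return the first row matching both DSID and etag, or None."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    for row in reader:
        if int(row["dsid"]) == dsid and row["etag"] == etag:
            return {
                "cross_section_pb": float(row["cross_section_pb"]),
                "genFiltEff": float(row["genFiltEff"]),
                "kfactor": float(row["kfactor"]),
                "etag": row["etag"],
            }
    return None


print(query_table_sketch(SAMPLE_DB, 999001, "e2222"))
```

Matching on the etag as well as the DSID matters in this sketch because the same DSID appears twice with different values.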

ftag.find_metadata.write_metadata_to_h5(h5_filename: str, dsid: int, metadata_dict: dict[str, Any]) → None#

Write metadata values into an HDF5 file under metadata/<DSID>.

Parameters:

h5_filename (str) – Target HDF5 file.
dsid (int) – Dataset ID to write metadata for.
metadata_dict (dict[str, Any]) – Dictionary of metadata to inject.

ftag.find_metadata.handle_yaml_fallback(h5_path: pathlib.Path, yaml_data: dict[str, Any]) → None#

Use fallback metadata from YAML if automatic lookup fails.

Parameters:

h5_path (Path) – Path to the HDF5 file.
yaml_data (dict[str, Any]) – Metadata dictionary loaded from YAML.

Raises:

ValueError – If YAML is invalid, empty, or missing required fields.

ftag.find_metadata.parse_args_and_yaml() → tuple[list[str], dict[str, Any]]#

Parse CLI arguments and load YAML metadata if provided.

Returns: A tuple of (list of HDF5 file paths, YAML metadata dict).
Return type: tuple[list[str], dict[str, Any]]

ftag.find_metadata.process_single_file(path: pathlib.Path, yaml_data: dict[str, Any]) → None#

Process a single .h5 file by attempting a BigPanDA lookup, then falling back to YAML.

Parameters:

path (Path) – Path to the HDF5 file.
yaml_data (dict[str, Any]) – Optional fallback metadata.
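
The "automatic lookup first, YAML second" control flow can be sketched as a chain of steps where any step returning None triggers the fallback. This is a generic illustration of that pattern; `lookup_or_fallback` and its arguments are hypothetical, and the order and number of steps in the real function are not specified here.

```python
from __future__ import annotations

from collections.abc import Callable
from typing import Any


def lookup_or_fallback(
    steps: list[Callable[[Any], Any]],
    start: Any,
    fallback: Callable[[], Any],
) -> Any:
    """Feed ``start`` through ``steps``; call ``fallback`` if any step yields None."""
    value = start
    for step in steps:
        value = step(value)
        if value is None:
            # One stage of the automatic lookup failed: use the fallback.
            return fallback()
    return value


# Stand-in steps: the second one simulates a failed lookup.
result = lookup_or_fallback(
    steps=[lambda name: name.upper(), lambda _: None],
    start="file.h5",
    fallback=lambda: "metadata from YAML",
)
print(result)
```

In the module's terms, the steps would correspond to functions such as `extract_taskid_from_filename`, `fetch_taskinfo_from_bigpanda`, and `query_xsecdb`, each of which is documented to return None on failure.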

ftag.find_metadata.main() → None#

Entry point: parse arguments, download xsecDBs, process each file, and clean up.
