ftag.find_metadata
Attributes
Functions
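| validate_url_scheme(url) | Validate the scheme of a given URL, ensuring it is http or https. |
| download_xsecdb_files() | Download the PMG xsecDB files from CERN if they are not present locally. |
| extract_taskid_from_filename(h5_path) | Extract the BigPanDA Task ID (8-digit) from an HDF5 filename. |
| fetch_taskinfo_from_bigpanda(taskid) | Fetch task information from BigPanDA for a given Task ID. |
| extract_mc_container_from_json(data) | Extract the MC container name (e.g., mc16_13TeV.<something>) from a task JSON. |
| parse_line_from_taskname(taskname) | Extract DSID and etag from a task name string. |
| parse_campaign_from_taskname(taskname) | Derive campaign (mc15/mc16/etc.) from a task or container name. |
| extract_info_from_container(container) | Extract DSID, etag, and campaign name from a container string. |
| query_xsecdb(campaign, dsid, etag) | Look up cross-section metadata in the PMG xsecDB. |
| write_metadata_to_h5(h5_filename, dsid, metadata_dict) | Write metadata values into an HDF5 file under metadata/<DSID>. |
| handle_yaml_fallback(h5_path, yaml_data) | Use fallback metadata from YAML if automatic lookup fails. |
| parse_args_and_yaml() | Parse CLI arguments and load YAML metadata if provided. |
| process_single_file(path, yaml_data) | Process a single .h5 file by attempting a BigPanDA lookup, then falling back to YAML. |
| main() | Entry point: parse arguments, download xsecDBs, process each file, and clean up. |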
Module Contents
- ftag.find_metadata.XSECDB_MAP: dict[str, str]
- ftag.find_metadata.XSECDB_URL_BASE: str = 'https://atlas-groupdata.web.cern.ch/atlas-groupdata/dev/PMGTools/'
- ftag.find_metadata.validate_url_scheme(url: str) -> urllib.parse.ParseResult
Validate the scheme of a given URL, ensuring it is http or https.
- Parameters:
url (str) – URL string to validate.
- Returns:
Parsed URL object.
- Return type:
ParseResult
- Raises:
ValueError – If the URL scheme is not supported.
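The check described above can be sketched with the standard library alone. This is a minimal illustration, not the module's actual source: the name `check_url_scheme` and the constant `ALLOWED_SCHEMES` are hypothetical, and the allowed set is taken from the docstring.

```python
from urllib.parse import ParseResult, urlparse

# Assumption: only these two schemes are accepted, as the docstring states.
ALLOWED_SCHEMES = {"http", "https"}


def check_url_scheme(url: str) -> ParseResult:
    """Parse ``url`` and reject any scheme other than http or https."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Unsupported URL scheme: {parsed.scheme!r}")
    return parsed
```

Validating the scheme before opening a URL guards against `file://` and similar schemes being passed to a downloader by mistake.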
- ftag.find_metadata.download_xsecdb_files() -> None
Download the PMG xsecDB files from CERN if they are not present locally.
- ftag.find_metadata.extract_taskid_from_filename(h5_path: pathlib.Path) -> str | None
Extract the BigPanDA Task ID (8-digit) from an HDF5 filename.
- Parameters:
h5_path (Path) – Path object pointing to the .h5 file.
- Returns:
The Task ID as a string if found, otherwise None.
- Return type:
str | None
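One way to pull an 8-digit ID out of a filename is a regular expression with digit lookarounds, sketched below. The helper name `find_taskid` and the example filename are hypothetical, and the sketch assumes the Task ID is the first standalone run of exactly eight digits in the name; the real function may use a stricter pattern.

```python
from __future__ import annotations

import re
from pathlib import Path

# Assumption: the Task ID is a run of exactly eight digits that is not part
# of a longer digit sequence.
TASKID_RE = re.compile(r"(?<!\d)(\d{8})(?!\d)")


def find_taskid(h5_path: Path) -> str | None:
    """Return the first standalone 8-digit run in the filename, or None."""
    match = TASKID_RE.search(h5_path.name)
    return match.group(1) if match else None


# Hypothetical filename, for illustration only.
example = Path("/data/user.someone.12345678._000001.output.h5")
taskid = find_taskid(example)
```

Only `h5_path.name` is searched, so digits in parent directory names cannot produce a false match.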
- ftag.find_metadata.fetch_taskinfo_from_bigpanda(taskid: str) -> dict[str, Any] | None
Fetch task information from BigPanDA for a given Task ID.
- Parameters:
taskid (str) – BigPanDA task ID.
- Returns:
Task info as a dictionary if found, otherwise None.
- Return type:
dict[str, Any] | None
- ftag.find_metadata.extract_mc_container_from_json(data: dict[str, Any]) -> str | None
Extract the MC container name (e.g., mc16_13TeV.<something>) from a task JSON.
- Parameters:
data (dict[str, Any]) – Task info dictionary from BigPanDA.
- Returns:
The container string if found, otherwise None.
- Return type:
str | None
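The layout of the BigPanDA task JSON is not described here, so the sketch below makes no assumption about key names and instead walks the whole structure looking for the first string that resembles an MC container. The helper `find_mc_container`, the pattern, and the sample dictionary are all hypothetical.

```python
from __future__ import annotations

import re
from typing import Any

# Assumption: a container looks like "mcNN_<energy>TeV." followed by
# word characters and dots, e.g. mc16_13TeV.<something>.
CONTAINER_RE = re.compile(r"mc\d{2}_\w+TeV\.[\w.]+")


def find_mc_container(data: Any) -> str | None:
    """Depth-first search for the first MC-container-like string."""
    if isinstance(data, str):
        match = CONTAINER_RE.search(data)
        return match.group(0) if match else None
    if isinstance(data, dict):
        children = list(data.values())
    elif isinstance(data, list):
        children = data
    else:
        return None
    for child in children:
        found = find_mc_container(child)
        if found is not None:
            return found
    return None


# Made-up task info; the key names are placeholders, not BigPanDA's schema.
sample = {
    "task": {"taskname": "user.someone.test"},
    "datasets": [{"name": "mc16_13TeV.999999.hypothetical.e0001_s0002"}],
}
container = find_mc_container(sample)
```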
- ftag.find_metadata.parse_line_from_taskname(taskname: str) -> tuple[int | None, str | None]
Extract DSID and etag from a task name string.
- Parameters:
taskname (str) – Full task name.
- Returns:
A tuple of (DSID as int, etag as string), or (None, None) if not found.
- Return type:
tuple[int | None, str | None]
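A possible way to extract both values is sketched below. It assumes the DSID is the first dot-delimited field made of six to eight digits and the etag is the first tag of the form `e<digits>`; the helper name `parse_dsid_and_etag` and the sample name are hypothetical. The sketch also returns `(None, None)` unless both parts are found, which is one reading of the docstring.

```python
from __future__ import annotations

import re

# Assumption: DSID is a dot-delimited field of 6 to 8 digits.
DSID_RE = re.compile(r"\.(\d{6,8})\.")
# Assumption: the etag is "e" plus digits, bounded by ".", "_" or the string ends.
ETAG_RE = re.compile(r"(?:^|[._])(e\d+)(?=[._]|$)")


def parse_dsid_and_etag(taskname: str) -> tuple[int | None, str | None]:
    """Return (DSID, etag), or (None, None) if either part is missing."""
    dsid = DSID_RE.search(taskname)
    etag = ETAG_RE.search(taskname)
    if dsid is None or etag is None:
        return None, None
    return int(dsid.group(1)), etag.group(1)


# Made-up task name, for illustration only.
result = parse_dsid_and_etag("mc16_13TeV.999999.hypothetical_sample.e0001_s0002_r0003")
```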
- ftag.find_metadata.parse_campaign_from_taskname(taskname: str) -> str | None
Derive campaign (mc15/mc16/etc.) from a task or container name.
- Parameters:
taskname (str) – The name string.
- Returns:
Campaign string, or None if not found.
- Return type:
str | None
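If the campaign always appears as an `mcNN` prefix followed by an underscore, as in `mc16_13TeV`, it can be recovered with a single pattern. That is an assumption of this sketch, and `find_campaign` is a hypothetical name; the module may derive the campaign differently.

```python
from __future__ import annotations

import re

# Assumption: the campaign is "mc" plus two digits, directly followed by "_"
# and not preceded by another letter or digit.
CAMPAIGN_RE = re.compile(r"(?<![A-Za-z0-9])(mc\d{2})(?=_)")


def find_campaign(name: str) -> str | None:
    """Return the first mcNN token in ``name``, or None if there is none."""
    match = CAMPAIGN_RE.search(name)
    return match.group(1) if match else None
```

For a name such as `user.someone.mc16_13TeV.999999.sample` this yields `mc16`, because the lookbehind only rules out a letter or digit immediately before `mc`.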
- ftag.find_metadata.extract_info_from_container(container: str) -> tuple[int, str, str] | None
Extract DSID, etag, and campaign name from a container string.
- Parameters:
container (str) – The MC container string.
- Returns:
A tuple of (DSID, etag, campaign), or None if parsing fails.
- Return type:
tuple[int, str, str] | None
- ftag.find_metadata.query_xsecdb(campaign: str, dsid: int, etag: str) -> dict[str, Any] | None
Look up cross-section metadata in the PMG xsecDB.
- Parameters:
campaign (str) – Campaign name (e.g., mc16).
dsid (int) – Dataset ID.
etag (str) – Event tag.
- Returns:
Dictionary with cross_section_pb, genFiltEff, kfactor, and etag if found, otherwise None.
- Return type:
dict[str, Any] | None
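The lookup can be pictured as a scan over a whitespace-separated text table. The sketch below is illustrative only: it assumes the columns are ordered as dataset number, short name, cross section, filter efficiency, k-factor, further columns, and the etag last, and that the cross section is already in pb. The helper `lookup_xsec` and every value in `SAMPLE` are made up. Picking the file that matches `campaign` is left to the caller here.

```python
from __future__ import annotations

import io
from typing import Any, TextIO


def lookup_xsec(table: TextIO, dsid: int, etag: str) -> dict[str, Any] | None:
    """Return the row matching (dsid, etag) as a metadata dict, else None."""
    for line in table:
        fields = line.split()
        # Skip blank lines, the header, and rows too short to be data.
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        if int(fields[0]) == dsid and fields[-1] == etag:
            return {
                "cross_section_pb": float(fields[2]),  # assumed to be in pb
                "genFiltEff": float(fields[3]),
                "kfactor": float(fields[4]),
                "etag": fields[-1],
            }
    return None


# Placeholder table with invented numbers, not real cross sections.
SAMPLE = """\
dataset_number physics_short crossSection genFiltEff kFactor relUncertUP relUncertDOWN generator_name etag
999999 hypothetical_sample 1.5 0.5 1.0 0.0 0.0 SomeGenerator e0001
999999 hypothetical_sample 2.5 0.5 1.0 0.0 0.0 SomeGenerator e0002
"""

row = lookup_xsec(io.StringIO(SAMPLE), dsid=999999, etag="e0002")
```

Matching on both DSID and etag matters because the same dataset can appear more than once with different event tags.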
- ftag.find_metadata.write_metadata_to_h5(h5_filename: str, dsid: int, metadata_dict: dict[str, Any]) -> None
Write metadata values into an HDF5 file under metadata/<DSID>.
- Parameters:
h5_filename (str) – Target HDF5 file.
dsid (int) – Dataset ID to write metadata for.
metadata_dict (dict[str, Any]) – Dictionary of metadata to inject.
- ftag.find_metadata.handle_yaml_fallback(h5_path: pathlib.Path, yaml_data: dict[str, Any]) -> None
Use fallback metadata from YAML if automatic lookup fails.
- Parameters:
h5_path (Path) – Path to the HDF5 file.
yaml_data (dict[str, Any]) – Metadata dictionary loaded from YAML.
- Raises:
ValueError – If YAML is invalid, empty, or missing required fields.
- ftag.find_metadata.parse_args_and_yaml() -> tuple[list[str], dict[str, Any]]
Parse CLI arguments and load YAML metadata if provided.
- Returns:
A tuple of (list of HDF5 file paths, YAML metadata dict).
- Return type:
tuple[list[str], dict[str, Any]]
- ftag.find_metadata.process_single_file(path: pathlib.Path, yaml_data: dict[str, Any]) -> None
Process a single .h5 file by attempting a BigPanDA lookup, then falling back to YAML.
- Parameters:
path (Path) – Path to the HDF5 file.
yaml_data (dict[str, Any]) – Optional fallback metadata.
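The lookup-then-fallback order can be sketched independently of BigPanDA and HDF5 by passing the automatic lookup in as a callable. Everything here is hypothetical: `resolve_metadata`, the `"bigpanda"` and `"yaml"` labels, and the choice to return the result instead of writing it to the file, which is what the real function presumably does.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any, Callable, Optional

# Hypothetical type for any function that tries to find metadata for a file.
Lookup = Callable[[Path], Optional[dict]]


def resolve_metadata(
    path: Path, automatic_lookup: Lookup, yaml_data: dict[str, Any]
) -> tuple[str, dict[str, Any]]:
    """Try the automatic lookup first, then fall back to the YAML metadata."""
    found = automatic_lookup(path)
    if found is not None:
        return "bigpanda", found
    if yaml_data:
        return "yaml", yaml_data
    # Neither source produced metadata, so there is nothing to write.
    raise ValueError(f"No metadata available for {path}")
```

Keeping the lookup injectable makes the fallback order easy to exercise without network access.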
- ftag.find_metadata.main() -> None
Entry point: parse arguments, download xsecDBs, process each file, and clean up.