Dummy Data
Dummy data is sufficient for testing or demonstrating the puma API.
puma provides three functions to generate it: get_dummy_2_taggers, get_dummy_multiclass_scores, and get_dummy_tagger_aux.
The first function, get_dummy_2_taggers, returns a pandas.DataFrame with the following columns:
HadronConeExclTruthLabelID, rnnip_pu, rnnip_pc, rnnip_pb, dips_pu, dips_pc, dips_pb
which can be used in the following manner:
from puma.utils import get_dummy_2_taggers
df = get_dummy_2_taggers()
The second function, get_dummy_multiclass_scores, returns an output array
of shape (size, 3), the usual output format of our multi-class classifiers such as
DIPS, together with labels that follow the HadronConeExclTruthLabelID convention.
from puma.utils import get_dummy_multiclass_scores
output, labels = get_dummy_multiclass_scores()
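To illustrate the structure described above without requiring puma, the following is a minimal stand-in sketch written with the standard library only. It does not call puma or reproduce how puma generates its data. The helper name make_standin_scores is hypothetical, as are the assumed column order (light, c, b), the label values 0, 4 and 5, and the assumption that each row of scores is normalised like class probabilities.

```python
import math
import random

# Assumed mapping from score column to HadronConeExclTruthLabelID value:
# light-flavour = 0, c = 4, b = 5. The column order is an assumption.
LABELS = (0, 4, 5)


def softmax(row):
    """Turn three raw scores into values that sum to 1."""
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def make_standin_scores(size=9, seed=42):
    """Hypothetical stand-in returning (scores, labels); scores has shape (size, 3)."""
    rng = random.Random(seed)
    scores, labels = [], []
    for _ in range(size):
        true_class = rng.randrange(3)
        # Gaussian noise per class, with a boost on the true class.
        raw = [
            rng.gauss(0.0, 1.0) + (2.0 if k == true_class else 0.0)
            for k in range(3)
        ]
        scores.append(softmax(raw))
        labels.append(LABELS[true_class])
    return scores, labels


scores, labels = make_standin_scores(size=9)

# Predicted label per jet: the label of the highest-scoring column.
predicted = [LABELS[row.index(max(row))] for row in scores]
```

The same kind of check (one row of three scores per jet, one label per jet) can be applied to the output and labels returned by get_dummy_multiclass_scores, provided the stated assumptions hold for the real function.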
Finally, the get_dummy_tagger_aux function returns an h5 file containing both jet and
track collections (the tracks are needed for aux task plots). Aux task information is
generated for both vertexing and track origin classification. The collections include the following columns:
jets:
HadronConeExclTruthLabelID, GN2_pu, GN2_pc, GN2_pb, pt, eta, n_truth_promptLepton
tracks:
ftagTruthVertexIndex, GN2_VertexIndex, ftagTruthOriginLabel, GN2_TrackOrigin
which can be used in the following manner:
from puma.utils import get_dummy_tagger_aux
h5_file = get_dummy_tagger_aux()