Dummy Data
To test or demonstrate the puma API, dummy data is usually sufficient.
puma provides three functions for generating it: get_dummy_2_taggers,
get_dummy_multiclass_scores, and get_dummy_tagger_aux.
The first function, get_dummy_2_taggers, directly returns a pandas.DataFrame
with the following columns:
HadronConeExclTruthLabelID
rnnip_pu
rnnip_pc
rnnip_pb
dips_pu
dips_pc
dips_pb
The function can be used as follows:
from puma.utils import get_dummy_2_taggers
df = get_dummy_2_taggers()
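The three probability columns of each tagger (`*_pu`, `*_pc`, `*_pb`) are usually combined into a single b-tagging score before plotting. The following is a minimal sketch of that step, assuming the common log-likelihood-ratio form `log(pb / (fc * pc + (1 - fc) * pu))`. The helper name `b_tag_discriminant`, the value `fc=0.018`, and the toy rows are illustrative assumptions, not part of the puma API or its dummy data.

```python
import math


def b_tag_discriminant(pu, pc, pb, fc=0.018):
    """Log-likelihood-ratio b-tagging discriminant for a single jet.

    fc is the assumed charm fraction that weights the c-jet probability
    against the light-jet probability in the denominator.
    """
    return math.log(pb / (fc * pc + (1.0 - fc) * pu))


# Hand-written stand-in rows using the column names listed above.
# The numbers are made up for illustration only.
jets = [
    {"HadronConeExclTruthLabelID": 5, "dips_pu": 0.05, "dips_pc": 0.15, "dips_pb": 0.80},
    {"HadronConeExclTruthLabelID": 0, "dips_pu": 0.85, "dips_pc": 0.10, "dips_pb": 0.05},
]

# Attach the discriminant to each jet under a new, hypothetical key.
for jet in jets:
    jet["dips_disc"] = b_tag_discriminant(
        jet["dips_pu"], jet["dips_pc"], jet["dips_pb"]
    )
    print(jet["HadronConeExclTruthLabelID"], round(jet["dips_disc"], 3))
```

A jet with a high `dips_pb` gets a positive score and a light-like jet a negative one. The same expression can be applied to whole DataFrame columns at once if it is written with NumPy operations instead of `math.log`.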
The second function, get_dummy_multiclass_scores, returns an output array
of shape (size, 3), which is the usual output format of our multi-class
classifiers such as DIPS, together with labels that follow the
HadronConeExclTruthLabelID convention.
from puma.utils import get_dummy_multiclass_scores
output, labels = get_dummy_multiclass_scores()
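To make the relation between the `(size, 3)` score array and the labels concrete, here is a small sketch that turns each row of scores into a predicted label and compares it with the truth. It assumes the columns are ordered light, c, b and that the labels take the values 0, 4 and 5; both the ordering and the hand-written toy values are assumptions made for illustration, and `predicted_labels` is a hypothetical helper, not a puma function.

```python
# Assumed mapping from score column index to HadronConeExclTruthLabelID value:
# column 0 -> light-jets (0), column 1 -> c-jets (4), column 2 -> b-jets (5).
CLASS_LABELS = (0, 4, 5)


def predicted_labels(output):
    """Return, for each row of scores, the label of the highest-scoring class."""
    predictions = []
    for row in output:
        best_column = max(range(len(row)), key=lambda i: row[i])
        predictions.append(CLASS_LABELS[best_column])
    return predictions


# Hand-written stand-in for (output, labels); the values are made up.
output = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.1, 0.7],
    [0.5, 0.3, 0.2],
]
labels = [0, 4, 5, 5]

predictions = predicted_labels(output)
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)
print(predictions, accuracy)
```

With a NumPy array as input, `output.argmax(axis=1)` yields the same column indices in a single vectorised call.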
Finally, the get_dummy_tagger_aux function returns an h5 file containing both
a jet and a track collection (the latter is needed for aux task plots).
Aux task information is generated for both vertexing and track origin
classification. The collections include the following columns:
jets:
HadronConeExclTruthLabelID
GN2_pu
GN2_pc
GN2_pb
pt
eta
n_truth_promptLepton
tracks:
ftagTruthVertexIndex
GN2_VertexIndex
ftagTruthOriginLabel
GN2_TrackOrigin
The function can be used as follows:
from puma.utils import get_dummy_tagger_aux
file = get_dummy_tagger_aux()