Random sample picker for RDataFrame
Feature description
Hi,
We need an alternative to Range in RDataFrame that picks a random subset of entries. With real data we often need a representative subsample, and doing:
rdf = rdf.Range(1000)
would bias the results, because it selects only the first 1000 entries, i.e. the data taken earliest in the year.
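To make the request concrete, here is a minimal sketch of one possible workaround, not an existing ROOT feature. It assumes the selection can be expressed as a string passed to Filter, and it uses the `rdfentry_` special column with a multiplicative hash to keep roughly a given fraction of entries spread over the whole dataset. The names `subsample_expr` and `keeps` are hypothetical helpers, and the hash is a deterministic stand-in for a true random draw.

```python
# Hypothetical helpers for illustration; none of these names exist in ROOT.
_MULT = 2654435761  # Knuth's multiplicative-hash constant (about 2**32 / golden ratio)
_MOD = 2**32


def _threshold(fraction):
    """Map a fraction in (0, 1] to a cut on the 32-bit hash value."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return int(fraction * _MOD)


def subsample_expr(fraction, seed=0):
    """Return a C++ expression string keeping roughly `fraction` of entries.

    Assumes `seed` is a non-negative integer and that the expression is
    evaluated where the `rdfentry_` column (an unsigned 64-bit entry
    counter) is available.
    """
    return (f"((rdfentry_ + {seed}ULL) * {_MULT}ULL) % {_MOD}ULL "
            f"< {_threshold(fraction)}ULL")


def keeps(entry, fraction, seed=0):
    """Pure-Python mirror of the expression, for checking the selection."""
    hashed = ((entry + seed) * _MULT) % 2**64  # emulate unsigned 64-bit wraparound
    return hashed % _MOD < _threshold(fraction)
```

Under these assumptions, a 1% subsample would be requested with rdf = rdf.Filter(subsample_expr(0.01)). Unlike Range, this gives a target fraction rather than an exact number of entries, and changing `seed` selects a different subset. In multithreaded runs `rdfentry_` may not correspond to the entry number in the input tree, so the selected subset may not be reproducible there.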
Alternatives considered
I implemented this myself: check this this and this, and take whatever you need from there. You will probably reimplement it yourselves anyway, given that ROOT is written in C++ rather than Python.
Cheers
Additional context
No response