Random sample picker for RDataFrame

Open acampove opened this issue 6 months ago • 0 comments

Feature description

Hi,

We need an alternative to Range in RDataFrame that picks a subset of entries at random. With real data we often need a representative subsample, and doing:

rdf=rdf.Range(1000)

would bias our results, because it only keeps the first 1000 entries, i.e. data taken at the start of the year.
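
To make the request concrete, here is a minimal sketch of the selection step in plain Python. It is not ROOT's API and not the implementation linked in this issue: the helper name `pick_random_entries` and its parameters are hypothetical, and it only assumes that the total number of entries is known in advance.

```python
import random


def pick_random_entries(n_entries, n_pick, seed=0):
    """Return a sorted list of n_pick distinct entry indices in [0, n_entries).

    Hypothetical helper for illustration only; it is not part of ROOT.
    """
    if n_pick > n_entries:
        raise ValueError("cannot pick more entries than are available")
    # A fixed seed makes the subsample reproducible between runs.
    rng = random.Random(seed)
    # sample() draws without replacement, so no entry is picked twice.
    return sorted(rng.sample(range(n_entries), n_pick))


# Pick 1000 entries spread over a dataset of one million entries,
# instead of the first 1000 that Range(1000) would return.
picked = pick_random_entries(n_entries=1_000_000, n_pick=1000, seed=42)
```

The resulting indices could then, in principle, be matched against the `rdfentry_` column in a `Filter`. That step is an assumption and is not shown here, and how `rdfentry_` behaves with multithreading enabled should be checked against the ROOT documentation.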

Alternatives considered

I implemented this myself; check this, this and this, and take whatever you need from there. You will probably implement it yourselves anyway, given that ROOT is written in C++ rather than Python.

Cheers

Additional context

No response

acampove, Jun 08 '25 12:06