Display does not respect parameters if another operation is booked before printing
Describe the bug
Apparently calling Display with some extra parameter like nRows is not respected if before calling Print one adds any other operation to the graph.
To Reproduce
#include <ROOT/RDataFrame.hxx>
#include <TString.h>
#include <string>
#include <vector>
#include <memory>
void write_tree()
{
int x{};
TString s{};
TFile file{"dataset.root", "recreate"};
TTree tree{"events", "events"};
tree.Branch("nJets", &x, "nJets/I");
tree.Branch("systName", &s);
std::vector<std::string> values{"nominal", "up", "down"};
for (int i = 0; i < 3; i++)
{
x = i + 1;
s.Replace(0, s.Length(), values[i]);
tree.Fill();
}
tree.Write();
}
void read_tree()
{
TFile file{"dataset.root", "read"};
std::unique_ptr<TTree> tree{file.Get<TTree>("events")};
tree->Scan("*");
}
void analyze()
{
ROOT::RDataFrame df{"events", "dataset.root"};
// Book display
auto disp = df.Display("systName", 1);
// Do anything else
df.Count().GetValue();
// The display doesn't respect the original parameters
// Prints all 3 rows whereas it should print only one
disp->Print();
}
int main()
{
write_tree();
read_tree();
analyze();
}
Setup
Additional context
Fedora 36 ROOT 6.26/06 from conda
It turns out this is due to an integer underflow that is triggered by the machinery of DisplayHelper:
https://github.com/root-project/root/blob/3160daafc008d8080cb9b3c602f4134b521ca8ad/tree/dataframe/inc/ROOT/RDF/ActionHelpers.hxx#L1334-L1337
The intended workflow is:
- Add the next row to be displayed to the
RDisplayobject. TheAddRowmethod decreases an internal counter of how many entries are left to be displayed at https://github.com/root-project/root/blob/3160daafc008d8080cb9b3c602f4134b521ca8ad/tree/dataframe/inc/ROOT/RDF/RDisplay.hxx#L227 - Check whether there are no more entries to be displayed (
!fDisplayerHelper->HasNext()). - If so, signal the previous node that this node has finished its job through the
StopProcessingmethod
There are a bunch of flaws in this workflow. Uncoditionally calling AddRow may trigger the integer underflow by calling fEntries-- when fEntries==0. This can be seen quite simply with the following example
root [0] ROOT::RDataFrame d{1};
root [1] auto dd = d.Define("b1", [] { return 42; }).Display<int>({"b1"}, 0);
root [2] dd->Print()
+-----+----+
| Row | b1 |
+-----+----+
| 0 | 42 |
+-----+----+
The row is printed even though the user asked for 0 entries to be displayed.
The other problem, more subtle and the actual culprit of the reproducer above, is when there is more than just the Display operation in the computation graph. DisplayHelper::Exec is called once per entry to be processed, as this is the normal working condition in RLoopManager::Run. When there is only the Display operation, the moment there are no more entries to be displayed, DisplayHelper will tell RLoopManager that it has finished, thus triggering an early stop of the execution (e.g. if a tree has 100 entries but we only want to display 5). When there is more than one operation, i.e. more than one child of the RLoopManager, the call to StopProcessing coming from DisplayHelper::Exec is not enough to stop the execution. Taking as an example the repro above, there is one Display and one Count operation, thus 2 children, thus 2 calls to StopProcessing must be issued in order to stop the execution.
From the point of view of DisplayHelper::Exec, the moment fEntries reaches zero it will call StopProcessing. But RLoopManager will still process the next entry because of the other child present. This means we are now calling again DisplayHelper::Exec, thus also RDisplay::AddRow, thus calling fEntries--, thus triggering an integer underflow. The final result is that the whole dataset will be displayed even though the user asked for less.
Hi @vepadulano,
It appears this issue is closed, but wasn't yet added to a project. Please add upcoming versions that will include the fix, or 'not applicable' otherwise.
Sincerely, :robot:
Hi @vepadulano,
It appears this issue is closed, but wasn't yet added to a project. Please add upcoming versions that will include the fix, or 'not applicable' otherwise.
Sincerely, :robot:
Hi @vepadulano,
It appears this issue is closed, but wasn't yet added to a project. Please add upcoming versions that will include the fix, or 'not applicable' otherwise.
Sincerely, :robot:
Hi @vepadulano,
It appears this issue is closed, but wasn't yet added to a project. Please add upcoming versions that will include the fix, or 'not applicable' otherwise.
Sincerely, :robot: