root icon indicating copy to clipboard operation
root copied to clipboard

[Python] TH1 equality operator pythonization not appropriate for ROOT histograms

Open guitargeek opened this issue 6 months ago • 3 comments

Description

The UHI implementation of the equality operator only compares the histogram type, binning and counts.

ROOT histograms have way more properties, like drawing options, and even more crucially they store the sum-of-weights-squared, where different values make a big difference in the statistical interpretation (the errorbars are directly inferred from it).

Therefore, I think the currently-implemented equality check is oversimplifying things and therefore potentially dangerous.

Furthermore, the new equality operator marks a significant behavior change that is undocumented in the release notes, since previously it was doing a comparison by pointer. This should be documented in the release notes if we decide to keep the UHI implementation.

We have examples in the ROOT code base itself, where the equality operator wrongly assumed that it checks for pointer equality:

  • https://github.com/root-project/root/pull/19015
  • https://github.com/root-project/root/issues/19035

It's likely that there is user code that there is user code that makes the same assumption, which would be broken in 6.36 with the UHI.

Reproducer

import ROOT

hist1 = ROOT.TH1D("hist1", "hist1", 1, -1., 1.)
hist1.Fill(0., 2.)
hist1.Fill(0., 2.)
print(hist1.GetSumw2()[1])

hist2 = ROOT.TH1D("hist2", "hist2", 1, -1., 1.)
hist2.Fill(0., 4.)
print(hist2.GetSumw2()[1])

print(hist1 == hist2) # ROOT will tell you the histograms are identical, while they have different sumw2!

The output will be:

8.0
16.0
True

guitargeek avatar Jun 13 '25 19:06 guitargeek

We have examples in the ROOT code base itself, where the equality operator wrongly assumed that it checks for pointer equality: It's likely that there is user code that there is user code that makes the same assumption

Let me add that in my opinion that logic was flawed in the first place and I actually consider the fact that the new operator equality breaks that assumption a happy side-effect.

ROOT will tell you the histograms are identical, while they have different sumw2!

This is indeed not good and probably we should extend the equality operator.

In general, there is no clear definition of what makes two histograms equal, since that is implementation-dependent. The UHI protocol does not include the __eq__ operator (so far), so I believe we should first decide what makes two ROOT histograms equal, and implement our operator accordingly (note this has never been done before).

vepadulano avatar Jun 15 '25 19:06 vepadulano

Ah that's good to know that this doesn't come from UHI! Good, than we have the flexibility to change the implementation of the equality operator.

guitargeek avatar Jun 16 '25 05:06 guitargeek

Ah that's good to know that this doesn't come from UHI!

It is indeed not part of the protocol, but we need it to integrate the UHI test suite

siliataider avatar Jun 16 '25 09:06 siliataider