Serialisation (and therefore I/O) issues with TF1 and TFitResultPtr

Open dpiparo opened this issue 1 year ago • 4 comments

Original title: "TF1 and TFitResultPtr do not serialise correctly pickle, and this is an issue with Python multiprocessing"

Check duplicate issues.

  • [X] Checked for duplicates

Description

TF1 and TFitResultPtr do not serialise correctly with pickle. This causes issues with multiprocessing in Python as well as with distributed execution, e.g. with DistRDF.

Reproducer

One can see with the reproducer below that:

  • The fit succeeds and the result pointer is sane if nothing is pickled and unpickled
  • The fit fails if the function is pickled and unpickled
  • The fit result pointer is no longer sane once it has been pickled and unpickled

import ROOT
import pickle

def SerialiseDeserialise(obj):
    return pickle.loads(pickle.dumps(obj))

h = ROOT.TH1F("myHist", "myTitle", 64, -4, 4)
h.FillRandom("gaus")
f1 = ROOT.TF1("f1", "gaus")
f1_d = SerialiseDeserialise(f1)

res = h.Fit(f1, "S")
print ("Status is ", res.Status())

# Check fit with de-serialised TF1
res = h.Fit(f1_d, "S")
print ("Status is ", res.Status())

# Check de-serialised result ptr
res_d = SerialiseDeserialise(res)
print ("Status is ", res_d.Status())

ROOT version

master (I suspect all)

Installation method

from sources

Operating system

MacOS

Additional context

No response

dpiparo • Aug 07 '24 13:08

One can be even more precise. The issue is in the I/O, and pickle is just picking it up:

import ROOT
import pickle

index = 0
files = []

def SerialiseDeserialise(obj):
    global index, files
    fname = f"tmp_{index}.root"  # f-string, so each call writes a distinct file
    f = ROOT.TFile(fname, "RECREATE")
    obj.Write()
    f.Close()
    f = ROOT.TFile(fname)
    files.append(f)
    index += 1
    return f.Get(obj.GetName())

h = ROOT.TH1F("myHist", "myTitle", 64, -4, 4)
h.FillRandom("gaus")
f1 = ROOT.TF1("f1", "gaus")
f1_d = SerialiseDeserialise(f1)

res = h.Fit(f1, "S")
print ("Status is ", res.Status())

# Check fit with de-serialised TF1
res = h.Fit(f1_d, "S")
print ("Status is ", res.Status())

# Check de-serialised result ptr
res_d = SerialiseDeserialise(res)
print ("Status is ", res_d.Status())

dpiparo • Aug 07 '24 13:08

A new problem was detected: https://root-forum.cern.ch/t/multiprocessing-fits-within-pyroot/60297/10. It can be easily reproduced by defining a TF1 as follows:


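import math

def fit_thermal(x, par):
    dndy = par[0]
    T    = par[1]
    m0   = par[2]
    mt   = math.sqrt(x[0]*x[0] + m0*m0)
    val  = (dndy/(T*(m0 + T))) * math.exp(-(mt - m0)/T)
    return val

# TF1Wrapper is not defined in this snippet (presumably the wrapper class from the linked forum thread)
thermal = TF1Wrapper("thermal", fit_thermal, 0, 2, 3)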

dpiparo • Aug 08 '24 08:08

This is a known issue. Since the beginning of ROOT, a TF1 based on code (C++ or Python) cannot be cloned or saved to disk. See for example https://root.cern.ch/doc/master/classTF1.html#a290e2d1c06125aca17f20a1c817c3d4e. Only a static snapshot of the function is saved, which can be used for plotting but not for fitting. The function needs to be re-created using the function expression.

lmoneta • Aug 08 '24 09:08
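
As a sketch of the re-creation approach described above: rather than serialising the TF1 itself, one can ship a plain, picklable description of it and rebuild the function on the receiving side. The helper names `make_recipe` and `build_tf1` are hypothetical and not part of ROOT. This is a user-side workaround that assumes the function is defined by a formula string; it is not a fix for the I/O issue.

```python
import pickle


def make_recipe(name, formula, xmin, xmax, params=()):
    """Describe a formula-based TF1 with plain Python types (hypothetical helper)."""
    return {
        "name": name,
        "formula": formula,
        "xmin": float(xmin),
        "xmax": float(xmax),
        "params": [float(p) for p in params],
    }


def build_tf1(recipe):
    """Re-create the TF1 from its expression; ROOT is imported only here."""
    import ROOT
    f = ROOT.TF1(recipe["name"], recipe["formula"], recipe["xmin"], recipe["xmax"])
    for i, value in enumerate(recipe["params"]):
        f.SetParameter(i, value)
    return f


recipe = make_recipe("f1", "gaus", -4, 4, params=(1.0, 0.0, 1.0))
# The recipe survives pickling because it only holds strings and floats.
restored = pickle.loads(pickle.dumps(recipe))
```

In a worker process, `build_tf1(restored)` would then give a fresh TF1 that can be passed to `h.Fit(...)`.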

Fair enough (we have the potential to recover Python and jitted functions then!). The comments above are still valid though.

dpiparo • Aug 08 '24 13:08
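
For functions defined in Python code, such as `fit_thermal` above, a comparable user-side sketch is to send only a name across the process boundary and look the callable up in a registry that every worker imports. `FUNCTIONS`, `describe` and `rebuild` are hypothetical names, and the sketch assumes the module holding the registry is importable in each worker. It does not reflect how ROOT itself might recover such functions.

```python
import math
import pickle


def fit_thermal(x, par):
    dndy, T, m0 = par[0], par[1], par[2]
    mt = math.sqrt(x[0] * x[0] + m0 * m0)
    return (dndy / (T * (m0 + T))) * math.exp(-(mt - m0) / T)


# Hypothetical registry: workers import this module and resolve names locally.
FUNCTIONS = {"thermal": fit_thermal}


def describe(key, xmin, xmax, npar):
    """Picklable description holding only the registry key and numbers."""
    return {"key": key, "xmin": float(xmin), "xmax": float(xmax), "npar": int(npar)}


def rebuild(desc):
    """Re-create the TF1 on the receiving side; ROOT is imported only here."""
    import ROOT
    func = FUNCTIONS[desc["key"]]
    # PyROOT accepts a Python callable with signature (x, par) plus the parameter count.
    return ROOT.TF1(desc["key"], func, desc["xmin"], desc["xmax"], desc["npar"])


# Only the description crosses the pickle boundary, never the TF1 or the callable.
desc = pickle.loads(pickle.dumps(describe("thermal", 0, 2, 3)))
```

The object returned by `rebuild(desc)` references the Python callable, so the registry entry has to stay alive for as long as the TF1 is in use.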