TTime needs a custom/manual Model class

Open jpivarski opened this issue 1 year ago • 3 comments

As discussed in https://root-forum.cern.ch/t/ttime-saved-to-a-ttree-in-root5-and-root6/54031 and #859, the TTime class breaks the usual pattern and includes headers because

the dictionary generated is requesting the ‘very’ old I/O for the class TTime.

In my reading of this, the only way to get it right is to add a custom/manual Model class in the models directory and have it read headers unconditionally (like some other old classes, TList, etc.).

Before I forget where I put the file, here is the baseline Model that needs to be copied and modified (to include @num_bytes and @instance_version regardless of the value of header in strided_interpretation). Also, the AwkwardForth code should start with a 6 stream skip.

import struct

import numpy
import uproot


class Model_TTime_v2(uproot.model.VersionedModel):
    def read_members(self, chunk, cursor, context, file):
        import uproot._awkward_forth
        if self.is_memberwise:
            raise NotImplementedError(
                f"memberwise serialization of {type(self).__name__}\nin file {self.file.file_path}"
            )

        forth_stash = uproot._awkward_forth.forth_stash(context)
        if forth_stash is not None:
            forth_obj = forth_stash.get_gen_obj()
            content = {}
            key = forth_obj.get_keys(1)
            form_key = f"node{key}-data"
            forth_stash.add_to_header(f"output node{key}-data int64\n")
            content['fMilliSec'] = { "class": "NumpyArray", "primitive": "int64", "inner_shape": [], "parameters": {}, "form_key": f"node{key}"}
            forth_stash.add_to_pre(f"stream !q-> node{key}-data\n")
            if forth_obj.should_add_form():
                forth_obj.add_form_key(form_key)
        self._members['fMilliSec'] = cursor.field(chunk, self._format0, context)
        if forth_stash is not None:
            if forth_obj.should_add_form():
                forth_obj.add_form({'class': 'RecordArray', 'contents': content, 'parameters': {'__record__': 'TTime'}}, len(content))
            temp = forth_obj.add_node('dynamic', forth_stash.get_attrs(), "i64", 0, None)

    def read_member_n(self, chunk, cursor, context, file, member_index):
        if member_index == 0:
            self._members['fMilliSec'] = cursor.field(chunk, self._format_memberwise0, context)

    @classmethod
    def strided_interpretation(cls, file, header=False, tobject_header=True, breadcrumbs=(), original=None):
        if cls in breadcrumbs:
            raise uproot.interpretation.objects.CannotBeStrided('classes that can contain members of the same type cannot be strided because the depth of instances is unbounded')
        breadcrumbs = breadcrumbs + (cls,)
        members = []
        if header:
            members.append(('@num_bytes', numpy.dtype('>u4')))
            members.append(('@instance_version', numpy.dtype('>u2')))
        members.append(('fMilliSec', numpy.dtype('>i8')))
        return uproot.interpretation.objects.AsStridedObjects(cls, members, original=original)

    @classmethod
    def awkward_form(cls, file, context):
        from awkward.forms import NumpyForm, ListOffsetForm, RegularForm, RecordForm
        if cls in context['breadcrumbs']:
            raise uproot.interpretation.objects.CannotBeAwkward('classes that can contain members of the same type cannot be Awkward Arrays because the depth of instances is unbounded')
        context['breadcrumbs'] = context['breadcrumbs'] + (cls,)
        contents = {}
        if context['header']:
            contents['@num_bytes'] = uproot._util.awkward_form(numpy.dtype('u4'), file, context)
            contents['@instance_version'] = uproot._util.awkward_form(numpy.dtype('u2'), file, context)
        contents['fMilliSec'] = uproot._util.awkward_form(numpy.dtype('>i8'), file, context)
        return RecordForm(list(contents.values()), list(contents.keys()), parameters={'__record__': 'TTime' })

    _format0 = struct.Struct('>q')
    _format_memberwise0 = struct.Struct('>q')
    base_names_versions=[]
    member_names = ['fMilliSec']
    class_flags = {}

And then these files, root5_6_examples.zip, should be used to make a test.
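To make the intended byte layout concrete, here is a minimal standard-library sketch of what "headers unconditionally" means for a single TTime object: 4 bytes of @num_bytes, 2 bytes of @instance_version, then the 8-byte fMilliSec. It is not uproot code. The helper names `pack_ttime` and `unpack_ttimes` are hypothetical, the buffer is synthetic, and masking off the byte-count flag bit before storing @num_bytes is a choice made for this illustration only.

```python
import struct

# Assumed layout of one unsplit TTime object, as described in this issue:
#   4 bytes  @num_bytes         (big-endian uint32, byte count with flag bit set)
#   2 bytes  @instance_version  (big-endian uint16)
#   8 bytes  fMilliSec          (big-endian int64)
K_BYTE_COUNT_MASK = 0x40000000  # flag bit ROOT sets on byte-count fields
TTIME_WITH_HEADER = struct.Struct(">IHq")  # 14 bytes, no padding


def pack_ttime(millisec, version=2):
    """Build the bytes of one TTime object (hypothetical helper, illustration only)."""
    # The byte count covers everything after the 4-byte count field itself.
    num_bytes = TTIME_WITH_HEADER.size - 4
    return TTIME_WITH_HEADER.pack(K_BYTE_COUNT_MASK | num_bytes, version, millisec)


def unpack_ttimes(buffer):
    """Read consecutive TTime objects, always expecting the 6-byte header."""
    records = []
    for raw_count, version, millisec in TTIME_WITH_HEADER.iter_unpack(buffer):
        records.append(
            {
                "@num_bytes": raw_count & ~K_BYTE_COUNT_MASK,
                "@instance_version": version,
                "fMilliSec": millisec,
            }
        )
    return records


# Two synthetic objects back to back, as they would sit in an unsplit branch.
data = pack_ttime(1500) + pack_ttime(-42)
records = unpack_ttimes(data)
```

The fixed 14-byte stride is the same thing the strided interpretation would express by always listing '@num_bytes' ('>u4') and '@instance_version' ('>u2') ahead of 'fMilliSec' ('>i8').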

jpivarski, Mar 15 '23 18:03

Not so fast: maybe we just need a list of classes that have this exceptional header behavior? https://github.com/scikit-hep/uproot5/discussions/859#discussioncomment-5325233

It seems that this is not one of those examples... or maybe the reason this is a recurring problem is that there are a lot of "old" classes? If so, then maybe issue #861 is not the best solution, as it would lead to a lot of duplicated code; maybe we need a list of classes that have headers, despite the general rule that new-style classes do not...

jpivarski, Mar 15 '23 18:03

After a conversation with Philippe, I learned that anything that is unsplit should have these 6-byte headers. TTime is not an exception. For some reason, we had concluded that the presence of these headers varies (so we have an argument, in some places two for inner/outer, to configure it), with a default of expecting no headers. We should not expect to find any unsplit branches without 6-byte headers.

So I wonder what would happen if we toggled the default to expect headers. Any tests that fail with that change would be counterexamples to the expectation that unsplit branches must have 6-byte headers. One of us should see how many tests fail and drill down on a representative example file to see whether it really lacks num_bytes (4 bytes) and class_version (2 bytes) before each object, or whether we're making two mistakes that somehow cancel.
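For the drill-down, a rough check like the following could be run on the uncompressed bytes of a fixed-size object such as TTime. This is only a sketch under stated assumptions: `looks_like_headers` is a hypothetical helper, not part of uproot, and the buffers are synthetic stand-ins for basket data. It relies on ROOT setting the 0x40000000 flag bit on byte-count fields and on the count covering the 2-byte version plus the payload.

```python
import struct

K_BYTE_COUNT_MASK = 0x40000000  # flag bit ROOT sets on a byte-count field
HEADER = struct.Struct(">IH")   # num_bytes (4 bytes) + class_version (2 bytes)


def looks_like_headers(buffer, payload_size, expected_version):
    """Return True if every object in buffer starts with a plausible 6-byte header.

    payload_size is the size of the data members in bytes (8 for TTime).
    """
    stride = HEADER.size + payload_size
    if len(buffer) == 0 or len(buffer) % stride != 0:
        return False
    for start in range(0, len(buffer), stride):
        raw_count, version = HEADER.unpack_from(buffer, start)
        if not raw_count & K_BYTE_COUNT_MASK:
            return False  # flag bit missing, so this is not a byte count
        if (raw_count & ~K_BYTE_COUNT_MASK) != 2 + payload_size:
            return False  # the count must cover the version and the payload
        if version != expected_version:
            return False
    return True


# Synthetic buffers standing in for uncompressed basket contents.
with_headers = b"".join(
    struct.pack(">IHq", K_BYTE_COUNT_MASK | 10, 2, ms) for ms in (0, 1000, 2000)
)
without_headers = b"".join(struct.pack(">q", ms) for ms in (0, 1000, 2000))

has_headers = looks_like_headers(with_headers, payload_size=8, expected_version=2)
no_headers = looks_like_headers(without_headers, payload_size=8, expected_version=2)
```

A file where this check fails on a genuinely unsplit branch would be a real counterexample. A file where it passes but the test still fails would point toward the two-cancelling-mistakes explanation.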

jpivarski, Mar 24 '23 15:03

@GiovanniVolta encountered this issue and reported it on Gitter:

Hi All, I received a root file for a Ge spectroscopy with this structure:

['Energy;1',
 'Energy/_F_EnergyCH0@DT5725S_14295;1',
 'Energy/Calibration_0;1',
 'Time;1',
 'Time/_F_TimeCH0@DT5725S_14295;1',
 'RealTime_0;1',
 'LiveTime_0;1']

I would like to get access via uproot to Energy/Calibration_0, RealTime and LiveTime, but they have a strange format and I don't understand how to open them. See below:

print(file_HcomappsF['Energy/Calibration_0;1'], file_HcomappsF['RealTime_0;1'], file_HcomappsF['LiveTime_0;1'])
(<Unknown CalibrationCoefficient at 0x016bda34d250>,
 <Unknown TTime at 0x016bda328b90>,
 <Unknown TTime at 0x016bdf24cb90>)

Any suggestions? Thank you in advance.

Here is the file (gzipped):

HcompassF_226Ra_run_2_20231117_085722.root.gz

jpivarski, Nov 27 '23 17:11