
Reading Allpix² data

Open ClundXIII opened this issue 4 years ago • 9 comments

Hi,

I am fairly new to ROOT and I am trying to extract data without having to deal with ROOT's C++-like language. This is what I have come up with so far:

#!/usr/bin/env python3

import sys
import numpy
import uproot
from uproot import interpret, asgenobj, STLVector, Pointer

rdata = uproot.open("./data.root")

data = rdata["PixelHit/timepix"]

print(data.interpretation)
print(data)
print(data.show())
print(data.array(asgenobj(STLVector(Pointer("allpix_3a3a_PixelHit")))))

An example file can be found here: https://cloud.clundxiii.com/cloud/s/c4dQf5bERYHfsMf

I want to get the contents of _signal or getSignal() as an array or via an iterator (I would simply print it and pipe it wherever I want).

Allpix² should already be implemented somehow according to https://github.com/scikit-hep/uproot/issues/485

Is this published to the repository yet? I installed the uproot package via pip3.

ClundXIII avatar Jun 20 '20 19:06 ClundXIII

Can you post the output of the code above?

tamasgal avatar Jun 21 '20 01:06 tamasgal

Can you post the output of the code above?

asgenobj(STLVector(Pointer(allpix_3a3a_PixelHit)))
<TBranchElement b'timepix' at 0x7f9327e21128>
timepix                    TStreamerSTL               asgenobj(STLVector(Pointer(allpix_3a3a_PixelHit)))
None
Traceback (most recent call last):
  File "path/to/extractFromRoot.py", line 15, in <module>
    print(data.array(asgenobj(STLVector(Pointer("allpix_3a3a_PixelHit")))))
TypeError: __init__() missing 2 required positional arguments: 'context' and 'skipbytes'

ClundXIII avatar Jun 21 '20 06:06 ClundXIII

The problem is that this part of allpix_3a3a_PixelHit could not be parsed:

        _raise_notimplemented('TStreamerSTL', "{'_classversion': 4, '_fOffset': 0, '_fName': b'mc_particles_', '_fTitle': b'', '_fType': 500, '_fSize': 24, '_fArrayLength': 0, '_fArrayDim': 0, '_fMaxIndex': array([0, 0, 0, 0, 0], dtype=int32), '_fTypeName': b'vector<TRef>', '_fXmin': 0.0, '_fXmax': 0.0, '_fFactor': 0.0, '_fSTLtype': 1, '_fCtype': 61}", source, cursor)

Do you have the source code for allpix::PixelHit?

tamasgal avatar Jun 21 '20 10:06 tamasgal

Do you have the source code for allpix::PixelHit?

This should be it: https://gitlab.cern.ch/allpix-squared/allpix-squared/-/blob/master/src/objects/PixelHit.hpp

ClundXIII avatar Jun 21 '20 17:06 ClundXIII

This is the relevant part of the generated interpretation code from uproot:

        allpix_3a3a_Object._readinto(self, source, cursor, context, parent)
        self._pixel_5f_ = allpix_3a3a_Pixel.read(source, cursor, context, self)
        self._time_5f_, self._signal_5f_ = cursor.fields(source, cls._format1)
        self._pixel_5f_charge_5f_ = TRef.read(source, cursor, context, self)
        _raise_notimplemented('TStreamerSTL', "{'_classversion': 4, '_fOffset': 0, '_fName': b'mc_particles_', '_fTitle': b'', '_fType': 500, '_fSize': 24, '_fArrayLength': 0, '_fArrayDim': 0, '_fMaxIndex': array([0, 0, 0, 0, 0], dtype=int32), '_fTypeName': b'vector<TRef>', '_fXmin': 0.0, '_fXmax': 0.0, '_fFactor': 0.0, '_fSTLtype': 1, '_fCtype': 61}", source, cursor)

As written above, the missing piece is the TStreamerSTL. The lines before it look fine. The first line correctly reads allpix::Object, which is the base class (https://gitlab.cern.ch/allpix-squared/allpix-squared/-/blob/master/src/objects/PixelHit.hpp#L28). The second line parses the allpix::Pixel field, which contains just a few ordinary ROOT coordinate objects.

The next line reads the time and the signal, see here (https://gitlab.cern.ch/allpix-squared/allpix-squared/-/blob/master/src/objects/PixelHit.hpp#L37):

PixelHit(Pixel pixel, double time, double signal, const PixelCharge* pixel_charge = nullptr);

and then it fails to parse the PixelCharge structure, which also has a std::vector of PropagatedCharges, and deeper down there is also a DepositedCharge. I do not see any trace of code generated for these classes in uproot.
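As an aside on the time and signal line: a struct format of two big-endian doubles would read both fields in one call. The sketch below is only an illustration. It assumes that cls._format1 is equivalent to ">dd" (not checked against the generated code) and that the 16 bytes used here, copied from the raw dump of entry 7 later in this comment, are the time and signal of that hit.

```python
import struct

# 16 bytes copied from the raw dump of entry 7 shown later in this comment.
# That they hold time and signal is an assumption based on their position
# right after the allpix::Pixel block.
raw = bytes([0, 0, 0, 0, 0, 0, 0, 0, 64, 179, 251, 0, 0, 0, 0, 0])

# Two consecutive big-endian float64 values. The real contents of
# cls._format1 are assumed here, not taken from uproot.
time_and_signal = struct.Struct(">dd")

hit_time, hit_signal = time_and_signal.unpack(raw)
print("time:", hit_time, "signal:", hit_signal)
```

If that reading is right, this hit has time 0.0 and signal 5115.0.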

Here is the original PixelCharge signature (https://gitlab.cern.ch/allpix-squared/allpix-squared/-/blob/master/src/objects/PixelCharge.cpp#L17):

PixelCharge::PixelCharge(
    Pixel pixel,
    unsigned int charge,
    const std::vector<const PropagatedCharge*>& propagated_charges)

and the PropagatedCharge (https://gitlab.cern.ch/allpix-squared/allpix-squared/-/blob/master/src/objects/PropagatedCharge.cpp#L18):


PropagatedCharge::PropagatedCharge(ROOT::Math::XYZPoint local_position,
                                   ROOT::Math::XYZPoint global_position,
                                   CarrierType type,
                                   unsigned int charge,
                                   double event_time,
                                   const DepositedCharge* deposited_charge)

and last but not least the DepositedCharge (https://gitlab.cern.ch/allpix-squared/allpix-squared/-/blob/master/src/objects/DepositedCharge.cpp#L16):

DepositedCharge::DepositedCharge(ROOT::Math::XYZPoint local_position,
                                 ROOT::Math::XYZPoint global_position,
                                 CarrierType type,
                                 unsigned int charge,
                                 double event_time,
                                 const MCParticle* mc_particle)

Here you can see MCParticle, which, in contrast, has been successfully parsed by uproot.

I assume that uproot has problems with the std::vector of that type, since these container types contain some structures that are not yet understood (as far as I know), but @jpivarski is definitely the expert here.
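To get a feeling for what the unparsed vector<TRef> might look like on disk, the sketch below decodes the last 58 bytes of entry 7 from the raw dump further down. The layout is an assumption on my part: a 4-byte byte count, a 2-byte version and a 4-byte element count, followed by 12-byte records of (version, fUniqueID, fBits, process-ID index). It matches the byte count in the dump, but it is a guess, not something uproot 3 implements.

```python
import struct

# Last 58 bytes of entry 7, copied from the dump below.
tail = bytes(
    [64, 0, 0, 54, 0, 9, 0, 0, 0, 4]
    + [0, 1, 0, 0, 0, 5, 2, 0, 0, 0, 0, 0]
    + [0, 1, 0, 0, 0, 6, 2, 0, 0, 0, 0, 0]
    + [0, 1, 0, 0, 0, 7, 2, 0, 0, 0, 0, 0]
    + [0, 1, 0, 0, 0, 8, 2, 0, 0, 0, 0, 0]
)

# Vector header: byte-count word, version, number of elements (big-endian).
header = struct.Struct(">IHI")
word, version, size = header.unpack_from(tail, 0)
byte_count = word & 0x3FFFFFFF  # drop the 0x40000000 flag bit

# Assumed per-element layout: version, fUniqueID, fBits, process-ID index.
record = struct.Struct(">HIIH")
refs = [
    record.unpack_from(tail, header.size + i * record.size)
    for i in range(size)
]
unique_ids = [ref[1] for ref in refs]

print(byte_count, version, size, unique_ids)
```

Under this assumed layout the four references carry the unique IDs 5, 6, 7 and 8, and the byte count of 54 plus its own 4 bytes accounts for all 58 bytes.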

Uproot 4 will provide more tools to dig deeper, but if you want a binary adventure, you can fire up a hex editor and examine the dump of one entry using everything we have collected so far. You can get access to the raw data like this (note that at the end I dump it to a file, which can then be dissected in a hex editor of your choice):

In [1]: import uproot

In [2]: import numpy as np

In [3]: f = uproot.open("data.root")

In [4]: data = f["PixelHit/timepix"].array(uproot.asdebug)

In [5]: list(map(len, data))[:15]  # find "non-empty" entries: more than 10 bytes, the usual header size
Out[5]: [10, 10, 10, 10, 10, 837, 813, 301, 10, 10, 563, 10, 10, 10, 10]

In [6]: data[7]
Out[6]:
array([ 64,   0,   1,  41,   0,   9,   0,   0,   0,   1,  64,   0,   1,
        31, 255, 255, 255, 255,  97, 108, 108, 112, 105, 120,  58,  58,
        80, 105, 120, 101, 108,  72, 105, 116,   0,  64,   0,   1,   6,
         0,   4,  64,   0,   0,  12,   0,   2,   0,   1,   0,   0,   0,
         0,   2,   0,   0,   0,  64,   0,   0, 154,   0,   1,  64,   0,
         0,  24,   0,   0, 240, 243, 111,  43,  64,   0,   0,  14,   0,
         0, 153, 176,  17, 183,   0,   0,   0, 128,   0,   0,   0, 138,
        64,   0,   0,  40,   0,   0, 229, 102, 217, 110,  64,   0,   0,
        30,   0,   0, 133, 176,  44, 232,  64,  28,  40, 245, 194, 143,
        92,  41,  64,  30,  92,  40, 245, 194, 143,  92, 191, 208,   0,
         0,   0,   0,   0,   0,  64,   0,   0,  40,   0,   0, 229, 102,
       217, 110,  64,   0,   0,  30,   0,   0, 133, 176,  44, 232,  63,
       156,  40, 245, 194, 143,  92,   0,  63, 226, 122, 225,  71, 174,
        20, 120, 191, 208,   0,   0,   0,   0,   0,   0,  64,   0,   0,
        32,   0,   0, 246, 178, 128,  23,  64,   0,   0,  22,   0,   0,
       150, 195,  36, 188,  63, 172,  40, 245, 194, 143,  92,  41,  63,
       172,  40, 245, 194, 143,  92,  41,   0,   0,   0,   0,   0,   0,
         0,   0,  64, 179, 251,   0,   0,   0,   0,   0,   0,   1,   0,
         0,   0,  96,   2,   0,   0,   0,   0,   0,  64,   0,   0,  54,
         0,   9,   0,   0,   0,   4,   0,   1,   0,   0,   0,   5,   2,
         0,   0,   0,   0,   0,   0,   1,   0,   0,   0,   6,   2,   0,
         0,   0,   0,   0,   0,   1,   0,   0,   0,   7,   2,   0,   0,
         0,   0,   0,   0,   1,   0,   0,   0,   8,   2,   0,   0,   0,
         0,   0], dtype=uint8)

In [7]: data_hex = np.array(data[7]).tobytes()

In [8]: data_hex
Out[8]: b'@\x00\x01)\x00\t\x00\x00\x00\x01@\x00\x01\x1f\xff\xff\xff\xffallpix::PixelHit\x00@\x00\x01\x06\x00\x04@\x00\x00\x0c\x00\x02\x00\x01\x00\x00\x00\x00\x02\x00\x00\x00@\x00\x00\x9a\x00\x01@\x00\x00\x18\x00\x00\xf0\xf3o+@\x00\x00\x0e\x00\x00\x99\xb0\x11\xb7\x00\x00\x00\x80\x00\x00\x00\x8a@\x00\x00(\x00\x00\xe5f\xd9n@\x00\x00\x1e\x00\x00\x85\xb0,\xe8@\x1c(\xf5\xc2\x8f\\)@\x1e\\(\xf5\xc2\x8f\\\xbf\xd0\x00\x00\x00\x00\x00\x00@\x00\x00(\x00\x00\xe5f\xd9n@\x00\x00\x1e\x00\x00\x85\xb0,\xe8?\x9c(\xf5\xc2\x8f\\\x00?\xe2z\xe1G\xae\x14x\xbf\xd0\x00\x00\x00\x00\x00\x00@\x00\x00 \x00\x00\xf6\xb2\x80\x17@\x00\x00\x16\x00\x00\x96\xc3$\xbc?\xac(\xf5\xc2\x8f\\)?\xac(\xf5\xc2\x8f\\)\x00\x00\x00\x00\x00\x00\x00\x00@\xb3\xfb\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00`\x02\x00\x00\x00\x00\x00@\x00\x006\x00\t\x00\x00\x00\x04\x00\x01\x00\x00\x00\x05\x02\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x06\x02\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x07\x02\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x08\x02\x00\x00\x00\x00\x00'

In [9]: with open("dump.dat", "wb") as fobj:
    ...:     fobj.write(data_hex)
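To make the dump easier to navigate, here is a small sketch that decodes the first 35 bytes of entry 7 by hand. It assumes the usual ROOT framing: a 4-byte big-endian byte count with the 0x40000000 flag bit set, a 2-byte version and, for a std::vector, a 4-byte element count. The names read_byte_count and K_BYTE_COUNT_MASK are my own helpers, not uproot API.

```python
import struct

K_BYTE_COUNT_MASK = 0x40000000  # flag bit marking a byte-count word

# First 35 bytes of entry 7, copied from Out[6] above.
head = bytes([64, 0, 1, 41, 0, 9, 0, 0, 0, 1, 64, 0, 1, 31,
              255, 255, 255, 255]) + b"allpix::PixelHit\x00"


def read_byte_count(buf, pos):
    """Decode a 4-byte big-endian byte count; return (count, next_pos)."""
    (word,) = struct.unpack_from(">I", buf, pos)
    if not word & K_BYTE_COUNT_MASK:
        raise ValueError(f"no byte-count flag at offset {pos}")
    return word & ~K_BYTE_COUNT_MASK, pos + 4


outer_count, pos = read_byte_count(head, 0)              # std::vector wrapper
version, n_items = struct.unpack_from(">HI", head, pos)  # vector version, size
pos += 6
inner_count, pos = read_byte_count(head, pos)            # first element
(class_tag,) = struct.unpack_from(">I", head, pos)       # 0xFFFFFFFF: class name follows
pos += 4
class_name = head[pos:head.index(b"\x00", pos)].decode("ascii")

print(outer_count, version, n_items, inner_count, hex(class_tag), class_name)
```

The outer byte count of 297 plus its own 4 bytes gives 301, the length reported for entry 7 in Out[5], so the framing assumption holds at least for the outermost layer.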

tamasgal avatar Jun 21 '20 18:06 tamasgal

Hi @tamasgal @jpivarski, any updates on that? I guess the issue is still not solved? I am asking because I have the same problem.

YannickDieter avatar Oct 10 '21 20:10 YannickDieter

Have you tried it with uproot4? The code-gen was greatly improved in the new version :)

tamasgal avatar Oct 10 '21 22:10 tamasgal

Yes, I have now tried it with uproot4 and got much closer to accessing the data. Now I am stuck on accessing the data of this object: <allpix::PixelHit (version 5) at 0x7f1cc25f7d50>

YannickDieter avatar Oct 11 '21 15:10 YannickDieter

Never mind, I solved it using the members() method.
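For anyone landing here later, a rough sketch of what this can look like with Uproot 4. Several things are assumptions: that the member is named "signal_" (inferred from the generated code earlier in this thread), that the branch path "PixelHit/timepix" from the original question still applies, and that each entry comes back as an iterable of model objects. As far as I know, Uproot 4 model objects expose member(name) as a method and members as a dict-like attribute. The helper names collect_member and read_signals are made up for this example.

```python
def collect_member(hits, name):
    """Return the named data member of every object in an iterable of hits."""
    return [hit.member(name) for hit in hits]


def read_signals(path, branch="PixelHit/timepix", member="signal_"):
    """Read one list of signal values per event from an Allpix² ROOT file.

    The branch path and the member name are assumptions, see the text above.
    """
    import uproot  # Uproot 4 or later; imported lazily so the helper above works alone

    with uproot.open(path) as root_file:
        # library="np" gives a NumPy object array with one container per event.
        events = root_file[branch].array(library="np")
    return [collect_member(event, member) for event in events]


if __name__ == "__main__":
    import sys

    # Usage: python extract_signals.py data.root
    if len(sys.argv) > 1:
        for event_signals in read_signals(sys.argv[1]):
            print(event_signals)
```

Printing one line per event keeps the output easy to pipe into other tools, which is what the original question asked for.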

YannickDieter avatar Oct 11 '21 16:10 YannickDieter