UnROOT.jl
Support for custom branches that contain std vectors of custom structs?
I have a ROOT file with a custom-type branch whose elements are structs that contain vectors of custom structs, e.g.
struct Foo
{
long a;
std::vector<short> b;
};
struct Bar
{
long c;
std::vector<Foo> d;
};
The file uses standard ROOT autogenerated streamers. I'm trying to read it using
struct Foo
a::Clong
b::Vector{Cshort}
end
struct Bar
d::Clong
e::Vector{Foo}
end
f = ROOTFile("myfile.root", customstructs = Dict("Foo" => Foo, "Bar" => Bar))
tree = LazyTree(f, "TreeOnFire", ["bar_branch"]);
tree[1].bar_branch; # fails
but I get
julia> tree[1].bar_branch;
ERROR: MethodError: no method matching -(::Nothing, ::Int64)
[...]
Stacktrace:
[1] _localindex_newbasket!(ba::LazyBranch{Plane, UnROOT.Nojagg, Vector{Plane}}, idx::Int64, tid::Int64)
[...]
Should this work or can't we handle custom structs like that automatically yet?
This can be a bit tricky. We don't have much (read: any) automatisation yet for custom types. There are different ways of doing it; maybe check out how I do it for some of the KM3NeT data structures here (we included that in UnROOT and its test suite for documentation purposes): https://github.com/JuliaHEP/UnROOT.jl/blob/master/test/runtests.jl#L439
The parsing action is defined here: https://github.com/JuliaHEP/UnROOT.jl/blob/master/src/custom.jl#L145
As you can see, it might require some manual bit-hopping. If you can provide a sample data, I can help you out.
Thanks @tamasgal, much appreciated! Adding a bit of bit-mangling code shouldn't be a problem. So I basically implement readtype and interped_data for the custom types, right? How do I read/iterate over std::vector in those?
For the std::vector, you need to skip the magical 10 bytes at the beginning and then use the UnROOT.readtype(io, Cshort) function. It's similar to read() but changes the byte order (ROOT is big endian).
It might need some trial and error, let me know if you need further help, but I think it should be fairly straightforward. ;)
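To illustrate the "skip 10 bytes, then read big-endian values" idea, here is a minimal, self-contained sketch. It does not call UnROOT at all: `read_be` and `read_short_vector` are hypothetical helpers written only for this example, and the sketch assumes that everything after the 10-byte header of an entry is element data.

```julia
# Hypothetical helper (not part of UnROOT): read one big-endian integer from io.
# read() returns the value in host byte order; ntoh() converts from big endian.
read_be(io::IO, ::Type{T}) where {T<:Integer} = ntoh(read(io, T))

# Assumption: `entry` holds one serialized std::vector<short>, i.e. a 10-byte
# header followed only by big-endian Int16 elements.
function read_short_vector(entry::Vector{UInt8}; header_bytes::Int = 10)
    io = IOBuffer(entry)
    skip(io, header_bytes)                              # jump over the header
    n = (length(entry) - header_bytes) ÷ sizeof(Int16)  # remaining element count
    return Int16[read_be(io, Int16) for _ in 1:n]
end

# Synthetic entry: 10 placeholder header bytes, then three big-endian Int16 values.
entry = vcat(zeros(UInt8, 10), UInt8[0x00, 0x01, 0x00, 0x02, 0x01, 0x02])
shorts = read_short_vector(entry)
```

With real data the header layout should be checked against the streamer info before relying on a fixed skip of 10 bytes.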
The documentation is split between https://juliahep.github.io/UnROOT.jl/dev/advanced/custom_branch/ and src/custom.jl.
Basically, you want to implement a function
function interped_data(rawdata, rawoffsets, ::Type{Vector{LVF64}}, ::Type{Offsetjagg})
but with your own type instead of LVF64.
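As a rough picture of the work such a method has to do, the sketch below splits a raw byte buffer into per-entry vectors using byte offsets. It is a stand-alone illustration, not UnROOT's implementation: `split_int32_entries` is a hypothetical name, and it assumes that the offsets are 0-based byte positions (one more than the number of entries), that each entry starts with a 10-byte header, and that the payload is big-endian Int32.

```julia
# Hypothetical stand-in for the parsing step; none of this is UnROOT API.
function split_int32_entries(rawdata::Vector{UInt8},
                             rawoffsets::AbstractVector{<:Integer};
                             header_bytes::Int = 10)
    entries = Vector{Vector{Int32}}()
    for i in 1:length(rawoffsets)-1
        start = rawoffsets[i] + header_bytes + 1   # 1-based index of first payload byte
        stop = rawoffsets[i+1]                     # 1-based index of last payload byte
        payload = rawdata[start:stop]              # empty when the entry has no elements
        # Reinterpret the bytes as Int32 and convert from big endian.
        push!(entries, Int32[ntoh(x) for x in reinterpret(Int32, payload)])
    end
    return entries
end

# Synthetic buffer: entry 1 holds the values 1 and 2, entry 2 is empty.
rawdata = vcat(zeros(UInt8, 10), UInt8[0, 0, 0, 1, 0, 0, 0, 2], zeros(UInt8, 10))
rawoffsets = [0, 18, 28]
entries = split_int32_entries(rawdata, rawoffsets)
```

A real interped_data method would additionally have to return the container type UnROOT expects for the given jaggedness type.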
I am having some trouble figuring this out. Could someone help?
I basically have a std::vector<std::vector<int>> in a ROOT tree I'm trying to read, which I didn't think would be too bad since the TLorentzVector is a vector of 4-vectors too... In my case, the length of the vectors in each event is different.
I tried doing the following:
customstruct = Dict("VecVecInt" => Vector{Vector{Int32}})
const VecVecInt = customstruct
function interped_data(rawdata, rawoffsets, ::Type{Vector{Vector{Int32}}}, ::Type{Offsetjagg})
    _size = 64 # needs to account for 32 bytes header
    dp = 0 # book keeping for copy_to!
    lr = length(rawoffsets)
    offset = Vector{Int32}(undef, lr)
    offset[1] = 0
    @views @inbounds for i in 1:lr-1
        start = rawoffsets[i]+10+1
        stop = rawoffsets[i+1]
        l = stop-start+1
        if l > 0
            unsafe_copyto!(rawdata, dp+1, rawdata, start, l)
            dp += l
            offset[i+1] = offset[i] + l
        else
            offset[i+1] = offset[i]
        end
    end
    resize!(rawdata, dp)
    real_data = interped_data(rawdata, offset, VecVecInt, Nojagg)
    offset .÷= _size
    offset .+= 1
    VectorOfVectors(real_data, offset)
end
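The last three lines of the function turn cumulative byte offsets into 1-based element offsets. A small stand-alone sketch of that arithmetic, with made-up example values and an element size of 4 bytes (Int32) assumed purely for illustration:

```julia
# Hypothetical example values: cumulative byte offsets for three entries,
# where the second entry is empty.
byte_offsets = [0, 8, 8, 20]
elem_size = sizeof(Int32)   # 4 bytes per element (assumption for this sketch)

# Integer-divide to get element counts, then shift to 1-based indices.
elem_offsets = byte_offsets .÷ elem_size .+ 1

# Flat element buffer and the per-entry slices the offsets describe.
data = Int32[10, 20, 30, 40, 50]
slices = [data[elem_offsets[i]:elem_offsets[i+1]-1] for i in 1:length(elem_offsets)-1]
```

The divisor has to match the byte size of one element of the flat buffer, otherwise the resulting indices no longer line up with the data.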
The error I get when running
data, offsets = UnROOT.array(f, "Tree/Event/PMTBinnedWaveforms", raw=true)
(where PMTBinnedWaveforms is the std::vector<std::vector<int>> I am trying to read) is
MethodError: no method matching ROOTFile(::String, ::Dict{String, DataType})
Closest candidates are:
ROOTFile(::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any)
@ UnROOT ~/.julia/packages/UnROOT/mBdWz/src/root.jl:13
ROOTFile(::Function, ::Any...; pv...)
@ UnROOT ~/.julia/packages/UnROOT/mBdWz/src/root.jl:25
ROOTFile(::String, ::Int32, ::Union{UnROOT.FileHeader32, UnROOT.FileHeader64}, ::Union{UnROOT.HTTPStream, UnROOT.MmapStream, UnROOT.XRDStream}, ::Union{UnROOT.TKey32, UnROOT.TKey64}, ::UnROOT.Streamers, ::UnROOT.ROOTDirectory, ::Dict{String, Type})
@ UnROOT ~/.julia/packages/UnROOT/mBdWz/src/root.jl:13
...
Stacktrace:
[1] top-level scope
@ In[6]:2
I am really stuck. Could someone help?
Ahm, do you have an example file? That should work "out-of-the-box".
Sure, actually here is one: https://drive.google.com/drive/folders/1qLURkYheLkdwoEj_tyGLG7JsV6wShSGt?usp=sharing
I'm trying to read PMTBinnedWaveforms and PMTWaveforms under ODTree.
julia> ROOTFile("/tmp/VetoPMTAnalysis_000.root")["ODTree"]
ODTree (TTree)
└─ "ODEvent"
julia> ROOTFile("/tmp/VetoPMTAnalysis_000.root")["ODTree"]["ODEvent"]
ODEvent
├─ TObject
│  ├─ fUniqueID
│  └─ fBits
├─ eventNumber
├─ muImpactParameter
├─ LXeImpactParameter
├─ muTrackLength
├─ muEnergy
├─ totalHits
├─ totalHitsPreQE
├─ initCherenkovOP
├─ PMTIDVec
├─ PMTWaveforms
├─ PMTBinnedWaveforms
└─ PMTTriggerVec
So your TTree contains a custom struct, which makes this tricky.
It's reading
fClassName: String "ODPMTDS"
fParentName: String "ODPMTDS"
and you can check the streamer for that class with UnROOT.streamerfor(f, "ODPMTDS") (see the output below).
The problem is that the branch splitting is limited in your case (the default is 99, which means that you basically get a ROOT branch with a corresponding path for each field), so you need a parser that is able to parse the whole class instance. This means that you cannot read only a single field of the ODPMTDS, e.g. PMTBinnedWaveforms; you need to deserialise everything.
julia> UnROOT.streamerfor(f, "ODPMTDS")
UnROOT.StreamerInfo(UnROOT.TStreamerInfo{UnROOT.TObjArray}("ODPMTDS", "", 0x14fb5c22, 1, UnROOT.TObjArray("", 0, Any[UnROOT.TStreamerBase
version: UInt16 0x0004
fOffset: Int64 0
fName: String "TObject"
fTitle: String "Basic ROOT object"
fType: Int32 66
fSize: Int32 0
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, -1877229523, 0, 0, 0]
fTypeName: String "BASE"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
fBaseVersion: Int32 1
, UnROOT.TStreamerBasicType
version: UInt16 0x0004
fOffset: Int64 0
fName: String "eventNumber"
fTitle: String ""
fType: Int32 3
fSize: Int64 4
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "int"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
, UnROOT.TStreamerBasicType
version: UInt16 0x0004
fOffset: Int64 0
fName: String "muImpactParameter"
fTitle: String ""
fType: Int32 5
fSize: Int64 4
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "float"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
, UnROOT.TStreamerBasicType
version: UInt16 0x0004
fOffset: Int64 0
fName: String "LXeImpactParameter"
fTitle: String ""
fType: Int32 5
fSize: Int64 4
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "float"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
, UnROOT.TStreamerBasicType
version: UInt16 0x0004
fOffset: Int64 0
fName: String "muTrackLength"
fTitle: String ""
fType: Int32 5
fSize: Int64 4
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "float"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
, UnROOT.TStreamerBasicType
version: UInt16 0x0004
fOffset: Int64 0
fName: String "muEnergy"
fTitle: String ""
fType: Int32 5
fSize: Int64 4
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "float"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
, UnROOT.TStreamerBasicType
version: UInt16 0x0004
fOffset: Int64 0
fName: String "totalHits"
fTitle: String ""
fType: Int32 3
fSize: Int64 4
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "int"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
, UnROOT.TStreamerBasicType
version: UInt16 0x0004
fOffset: Int64 0
fName: String "totalHitsPreQE"
fTitle: String ""
fType: Int32 3
fSize: Int64 4
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "int"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
, UnROOT.TStreamerBasicType
version: UInt16 0x0004
fOffset: Int64 0
fName: String "initCherenkovOP"
fTitle: String ""
fType: Int32 3
fSize: Int64 4
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "int"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
, UnROOT.TStreamerSTL
version: UInt16 0x0004
fOffset: Int64 0
fName: String "PMTIDVec"
fTitle: String ""
fType: Int32 500
fSize: Int32 24
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "vector<int>"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
fSTLtype: Int32 1
fCtype: Int32 3
, UnROOT.TStreamerSTL
version: UInt16 0x0004
fOffset: Int64 0
fName: String "PMTWaveforms"
fTitle: String "All hits on PMTs"
fType: Int32 500
fSize: Int32 24
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "vector<vector<float> >"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
fSTLtype: Int32 1
fCtype: Int32 61
, UnROOT.TStreamerSTL
version: UInt16 0x0004
fOffset: Int64 0
fName: String "PMTBinnedWaveforms"
fTitle: String ""
fType: Int32 500
fSize: Int32 24
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "vector<vector<int> >"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
fSTLtype: Int32 1
fCtype: Int32 61
, UnROOT.TStreamerSTL
version: UInt16 0x0004
fOffset: Int64 0
fName: String "PMTTriggerVec"
fTitle: String ""
fType: Int32 500
fSize: Int32 24
fArrayLength: Int32 0
fArrayDim: Int32 0
fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
fTypeName: String "vector<vector<int> >"
fXmin: Float64 0.0
fXmax: Float64 0.0
fFactor: Float64 0.0
fSTLtype: Int32 1
fCtype: Int32 61
])), Set(Any["TObject"]))
Python uproot can parse it
In [16]: up
Out[16]: <module 'uproot' from '/home/akako/.conda/envs/hep/lib/python3.11/site-packages/uproot/__init__.py'>
In [17]: r = up.open("/tmp/VetoPMTAnalysis_000.root")["ODTree"].arrays()
In [18]: r.PMTBinnedWaveforms[0]
Out[18]: <Array [[0, 0, 0, 0, 0, 0, ..., 0, 0, 0, 0, 0], ...] type='472 * var * int32'>
But I don't think we can do much here at the moment; parsing an arbitrary C++ class without maximal splitting is too hard for now.
If you convert the TTree to RNTuple, we should be able to read that easily.
Automatic parsing of custom stuff is definitely on the big todo list, but I am totally overloaded. Still hoping that a few more contributors jump in soon.
Yeah, I was using Python uproot before but stumbled on, and really like, UnROOT, hence the potential swap over.
Thanks for the help! I'll try converting to an RNTuple and see, I don't really need the other TTree right now anyway.
Or set the branch splitting to 99 ;)