UnROOT.jl icon indicating copy to clipboard operation
UnROOT.jl copied to clipboard

Native Julia I/O package to work with CERN ROOT files objects (TTree and RNTuple)

UnROOT.jl

All Contributors

JOSS Dev Build Status Codecov

UnROOT.jl is a reader for the CERN ROOT file format written entirely in Julia, without any dependence on ROOT or Python.

Installation Guide

  1. Download the latest Julia release
  2. Open up Julia REPL (hit ] once to enter Pkg mode, hit backspace to exit it)
julia>]
(v1.8) pkg> add UnROOT

Quick Start (see docs for more)

julia> using UnROOT

julia> f = ROOTFile("test/samples/NanoAODv5_sample.root")
ROOTFile with 2 entries and 21 streamers.
test/samples/NanoAODv5_sample.root
└─ Events
   ├─ "run"
   ├─ "luminosityBlock"
   ├─ "event"
   ├─ "HTXS_Higgs_pt"
   ├─ "HTXS_Higgs_y"
   └─ "⋮"

julia> mytree = LazyTree(f, "Events", ["Electron_dxy", "nMuon", r"Muon_(pt|eta)$"])
 Row │ Electron_dxy     nMuon   Muon_eta         Muon_pt
     │ Vector{Float32}  UInt32  Vector{Float32}  Vector{Float32}
─────┼───────────────────────────────────────────────────────────
 1   │ [0.000371]       0       []               []
 2   │ [-0.00982]       2       [0.53, 0.229]    [19.9, 15.3]
 3   │ []               0       []               []
 4   │ [-0.00157]       0       []               []
 ⋮   │     ⋮            ⋮             ⋮                ⋮
 

You can iterate through a LazyTree:

julia> for event in mytree
           @show event.Electron_dxy
           break
       end
event.Electron_dxy = Float32[0.00037050247]

julia> Threads.@threads for event in mytree # multi-threading
           ...
       end

Only one basket per branch will be cached so you don't have to worry about running out of RAM. At the same time, event inside the for-loop is not materialized until a field is accessed. If your event is fairly small or you need all of them anyway, you can collect(event) first inside the loop.

XRootD is also supported, depending on the protocol:

  • the "url" has to start with http:// or https://:
  • (1.6+ only) or the "url" has to start with root:// and have another // to separate server and file path
julia> r = @time ROOTFile("https://scikit-hep.org/uproot3/examples/Zmumu.root")
  0.034877 seconds (5.13 k allocations: 533.125 KiB)
ROOTFile with 1 entry and 18 streamers.

julia> r = ROOTFile("root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleMuParked.root")
ROOTFile with 1 entry and 19 streamers.

Branch of custom struct

We provide an experimental interface for hooking up UnROOT with your custom types that only takes 2 steps, as explained in the docs. As a show case for this functionality, the TLorentzVector support in UnROOT is implemented with the said plug-in system.

Support & Contributiing

  • Use Github issues for any bug reporting or feature request; feel free to make PRs, bug fixing, feature tuning, quality of life, docs, examples etc.
  • See CONTRIBUTING.md for more information and recommended workflows in contributing to this package.

TODOs

  • [x] Parsing the file header
  • [x] Read the TKeys of the top level dictionary
  • [x] Reading the available trees
  • [x] Reading the available streamers
  • [x] Reading a simple dataset with primitive streamers
  • [x] Reading of raw basket bytes for debugging
  • [ ] Automatically generate streamer logic
  • [x] Prettier show for Lazy*s
  • [ ] Clean up Cursor use
  • [x] Reading TNtuple #27
  • [x] Reading histograms (TH1D, TH1F, TH2D, TH2F, etc.) #48
  • [ ] Clean up the readtype, unpack, stream! and readobjany construct
  • [ ] Refactor the code and add more docs
  • [ ] Class name detection of sub-branches
  • [ ] High-level histogram interface

Acknowledgements

Special thanks to Jim Pivarski (@jpivarski) from the Scikit-HEP project, who is the main author of uproot, a native Python library to read and write ROOT files, which was and is a great source of inspiration and information for reverse engineering the ROOT binary structures.

Behind the scene

Some additional debug output:

julia> using UnROOT

julia> f = ROOTFile("test/samples/tree_with_histos.root")
Compressed stream at 1509
ROOTFile("test/samples/tree_with_histos.root") with 1 entry and 4 streamers.

julia> keys(f)
1-element Array{String,1}:
 "t1"

julia> keys(f["t1"])
Compressed datastream of 1317 bytes at 1509 (TKey 't1' (TTree))
2-element Array{String,1}:
 "mynum"
 "myval"

julia> f["t1"]["mynum"]
Compressed datastream of 1317 bytes at 6180 (TKey 't1' (TTree))
UnROOT.TBranch
  cursor: UnROOT.Cursor
  fName: String "mynum"
  fTitle: String "mynum/I"
  fFillColor: Int16 0
  fFillStyle: Int16 1001
  fCompress: Int32 101
  fBasketSize: Int32 32000
  fEntryOffsetLen: Int32 0
  fWriteBasket: Int32 1
  fEntryNumber: Int64 25
  fIOFeatures: UnROOT.ROOT_3a3a_TIOFeatures
  fOffset: Int32 0
  fMaxBaskets: UInt32 0x0000000a
  fSplitLevel: Int32 0
  fEntries: Int64 25
  fFirstEntry: Int64 0
  fTotBytes: Int64 170
  fZipBytes: Int64 116
  fBranches: UnROOT.TObjArray
  fLeaves: UnROOT.TObjArray
  fBaskets: UnROOT.TObjArray
  fBasketBytes: Array{Int32}((10,)) Int32[116, 0, 0, 0, 0, 0, 0, 0, 0, 0]
  fBasketEntry: Array{Int64}((10,)) [0, 25, 0, 0, 0, 0, 0, 0, 0, 0]
  fBasketSeek: Array{Int64}((10,)) [238, 0, 0, 0, 0, 0, 0, 0, 0, 0]
  fFileName: String ""


julia> seek(f.fobj, 238)
IOStream(<file test/samples/tree_with_histos.root>)

julia> basketkey = UnROOT.unpack(f.fobj, UnROOT.TKey)
UnROOT.TKey64(116, 1004, 100, 0x6526eafb, 70, 0, 238, 100, "TBasket", "mynum", "t1")

julia> s = UnROOT.datastream(f.fobj, basketkey)
Compressed datastream of 100 bytes at 289 (TKey 'mynum' (TBasket))
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=100, maxsize=Inf, ptr=1, mark=-1)

julia> [UnROOT.readtype(s, Int32) for _ in 1:f["t1"]["mynum"].fEntries]
Compressed datastream of 1317 bytes at 6180 (TKey 't1' (TTree))
25-element Array{Int32,1}:
  0
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 10
 10
 10
 10

Contributors ✨

Thanks goes to these wonderful people (emoji key):


Tamas Gal

💻 📖 🚇 🔣 ⚠️

Jerry Ling

💻 ⚠️ 🔣 📖

Johannes Schumann

💻 ⚠️ 🔣

Nick Amin

💻 ⚠️ 🔣

Mosè Giordano

🚇

Oliver Schulz

🤔

Misha Mikhasenko

🔣

This project follows the all-contributors specification. Contributions of any kind welcome!