Cannot access REntry fields with DataVector in Python
Check duplicate issues.
- [x] Checked for duplicates
Description
An error is reported when trying to access, from Python, an REntry field whose type is a DataVector (a class template with a default template argument):
entry["HLTNav_RepackedFeatures_MET"] # field of type: DataVector<xAOD::TrigMissingET_v1>
File "/cvmfs/sft-nightlies.cern.ch/lcg/nightlies/dev3/Tue/ROOT/HEAD/x86_64-el9-gcc13-opt/lib/ROOT/_pythonization/_rntuple.py", line 28, in _REntry_getitem
ptr_proxy = self._CallGetPtr(key)
File "/cvmfs/sft-nightlies.cern.ch/lcg/nightlies/dev3/Tue/ROOT/HEAD/x86_64-el9-gcc13-opt/lib/ROOT/_pythonization/_rntuple.py", line 24, in _REntry_CallGetPtr
return self._GetPtr[fieldType](key)
TypeError: Could not find "GetPtr<DataVector<xAOD::TrigMissingET_v1>>" (set cppyy.set_debug() for C++ errors):
Failed to instantiate "GetPtr<DataVector<xAOD::TrigMissingET_v1>>(::ROOT::RFieldToken&)"
Reproducer
# standard ATLAS Athena setup:
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source $ATLAS_LOCAL_ROOT_BASE/user/atlasLocalSetup.sh
asetup main--dev3LCG,latest,Athena
python
import ROOT
rnt = ROOT.RNTupleReader.Open("EventData", "/afs/cern.ch/user/m/mnowak/public/DAOD_PHYS.rntuple.pool.root")
entry = rnt.CreateEntry()
rnt.LoadEntry(1, entry)
entry["HLTNav_RepackedFeatures_MET"]
ROOT version
master
Installation method
LCG dev3
Operating system
Linux
Additional context
No response
Just to expand on what @Nowakus wrote, the use case is to read all the fields, something along these lines:
reader = ROOT.RNTupleReader.Open(
ROOT.TFile.Open(fileName).Get("MetaData")
)
entry = reader.CreateEntry()
reader.LoadEntry(0, entry)
for field in reader.GetDescriptor().GetTopLevelFields():
myObj = entry[field.GetFieldName()]
# extracting the data from myObj and writing to a dictionary
(In other words, this is what reader.Show(0) displays; such an ntuple always holds a single entry because it stores metadata. However, Show writes to std::ostream, which PyROOT does not capture, so getting the output requires something like subprocess.run(...), which is not very nice. The aim is to store the content as a dict instead.)
When the loop above arrives at the DataVector field, an exception is thrown:
RException: Could not find "GetPtr<DataVector<xAOD::TrigMissingET_v1>>" (set cppyy.set_debug() for C++ errors):
shared_ptr<DataVector<xAOD::TrigMissingET_v1> > ROOT::REntry::GetPtr(ROOT::RFieldToken token) =>
RException: type mismatch for field HLTNav_RepackedFeatures_MET: DataVector<xAOD::TrigMissingET_v1> vs. DataVector<xAOD::TrigMissingET_v1,DataModel_detail::NoBase>
At:
void ROOT::REntry::EnsureMatchingType(ROOT::RFieldToken) const [T = DataVector<xAOD::TrigMissingET_v1, DataModel_detail::NoBase>] [/build/jenkins/workspace/lcg_nightly_pipeline/build/projects/ROOT-HEAD/src/ROOT-HEAD-build/include/ROOT/REntry.hxx:140]
The same error can be reproduced using https://gitlab.cern.ch/maszyman/rntuple-atlas-datavector, so the issue does not seem to be limited to Python.
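The loop above can be wrapped so that one unreadable field does not abort the whole scan. The following is a minimal sketch, not ROOT API: `collect_fields` is a hypothetical helper, and it relies only on the calls already used in this thread (`GetDescriptor()`, `GetTopLevelFields()`, `GetFieldName()` and `entry[...]`). It is duck-typed, so simple stand-in objects can exercise it without a ROOT installation.

```python
def collect_fields(reader, entry):
    """Read every top-level field of `entry`; return (values, errors).

    Hypothetical helper: `reader` and `entry` only need the methods used
    in the loop above, so real RNTupleReader/REntry objects and simple
    stand-ins both work.
    """
    values, errors = {}, {}
    for field in reader.GetDescriptor().GetTopLevelFields():
        name = str(field.GetFieldName())
        try:
            values[name] = entry[name]
        except Exception as exc:
            # Broad on purpose: the reports in this thread show different
            # exception types for the same failing field.
            errors[name] = f"{type(exc).__name__}: {exc}"
    return values, errors
```

With this, `values, errors = collect_fields(reader, entry)` would return the readable fields as a dict and list the DataVector field under `errors` instead of stopping the loop.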
Hi @vepadulano, @enirolf,
It appears this issue is closed, but wasn't yet added to a project. Please add upcoming versions that will include the fix, or 'not applicable' otherwise.
Sincerely, :robot:
Just to confirm https://github.com/root-project/root/pull/19087 fixes the ATLAS use case.
Thanks!
My example is unfortunately still failing in exactly the same way as described. Using this build:
| Welcome to ROOT 6.37.01 https://root.cern |
| Built for linuxx8664gcc on Jun 27 2025, 22:56:06 |
| From heads/master@v6-37-01-7199-g0ba8001c8c |
Interestingly, though, after running the loop from Maciek it starts to work:
>>> for field in rnt.GetDescriptor().GetTopLevelFields():
...     myObj = entry[field.GetFieldName()]
...
>>> entry["HLTNav_RepackedFeatures_MET"]
<cppyy.gbl.DataVector<xAOD::TrigMissingET_v1> object at 0xb0779f0 held by std::shared_ptr<DataVector<xAOD::TrigMissingET_v1> > at 0x2bd0caa0>
@Nowakus can you remind me: are you defining the IsCollectionProxy type trait, or do you have the member type using IsCollectionProxy = std::true_type; as described here:
https://github.com/root-project/root/blob/504130023e6cc46d38d3a26ada908036ff1bc945/tree/ntuple/inc/ROOT/RField/RFieldProxiedCollection.hxx#L251-L264
edit: the Experimental is of course a mistake, let me fix that...
As far as I can see the word "IsCollectionProxy" does not show up in our code anywhere.
I am not sure if this is the same, but we generate and install TGenCollectionProxy by hand for DataVectors - but that happens when dictionaries are loaded only.
OK, that's a problem for the typed API (which we are using from Python): the compiler chooses the wrong class hierarchy, and you then get the error "DataVector has an associated collection proxy; use RProxiedCollectionField instead" (at least I assume that's what you are seeing?). We need the type traits for this to work correctly.
The error is the one in the description at the very top.
Ok indeed there seem to be more problems; with cppyy.set_debug():
lookup.funcname.file:1:8: error: too few template arguments for class template 'DataVector'
GetPtr<DataVector<xAOD::TrigMissingET_v1>>
^
input_line_154:1:44: note: template is declared here
template <typename T, typename BASE> class DataVector;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
@vepadulano
Can reproduce locally, now the issue is localised to the Python side only, and it's a different issue than the one fixed by the merged PR.
Hi @Nowakus ,
I have found a way to make your reproducer work, although for now it is only a workaround. It looks like the issue stems from the way we try to instantiate the C++ template from the Python bindings. See the following working example, run on lxplus within the Athena environment:
import ROOT
rnt = ROOT.RNTupleReader.Open("EventData", "mnowak_file.root")
entry = rnt.CreateEntry()
rnt.LoadEntry(1, entry)
# This works
entry._GetPtr[ROOT.DataVector[ROOT.xAOD.TrigMissingET_v1]]("HLTNav_RepackedFeatures_MET")
Notice that I'm passing the Python proxy of the class type to the _GetPtr template instantiation, rather than the type name as a string. It looks like instantiation via the type-name string currently does not do the same amount of work. I am investigating this.
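Until string-based instantiation is fixed, the workaround can be kept in one place. This is a minimal sketch, assuming the private `_GetPtr` attribute seen in the traceback above keeps its current behaviour; `get_field_ptr` is a hypothetical helper, and the `TypeError` it catches is the one from the issue description. The fallback returns whatever `_GetPtr` returns, which may not be wrapped exactly as `entry[name]` would wrap it.

```python
def get_field_ptr(entry, name, proxy_type):
    """Return the field value, falling back to proxy-based instantiation.

    `proxy_type` is the Python proxy of the C++ class, for example
    ROOT.DataVector[ROOT.xAOD.TrigMissingET_v1]. `_GetPtr` is a private
    pythonization detail and may change without notice.
    """
    try:
        return entry[name]  # string-based path; fails for DataVector here
    except TypeError:
        # Passing the class proxy forces the class information to be loaded.
        return entry._GetPtr[proxy_type](name)
```

In the reproducer this would be called as `get_field_ptr(entry, "HLTNav_RepackedFeatures_MET", ROOT.DataVector[ROOT.xAOD.TrigMissingET_v1])`.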
I see the same. Thanks for the update.
Hi, another update. I think I can now explain why we're seeing this issue. Unfortunately, I cannot yet provide a solution.
Let's start with an important clarification. The Python bindings are designed to be lazy whenever possible: if a certain ROOT class, function or attribute is not requested via the Python bindings, it won't be loaded. In this particular scenario, the main difference between the following two calls (class names taken from my local reproducer of the first part of this issue, now available as a test at https://github.com/root-project/root/blob/master/roottest/root/ntuple/atlas-datavector/AtlasLikeDataVector.hxx)
ROOT.foo["AtlasLikeDataVector<CustomStruct>"]
ROOT.foo[ROOT.AtlasLikeDataVector[ROOT.CustomStruct]]
is that in the second case, by retrieving a Python proxy to the AtlasLikeDataVector and CustomStruct classes, we're actively asking ROOT to populate the information about these classes in the type system. In the first case, the string doesn't immediately correspond to a request, so the loading of the class information is treated lazily.
When cppyy tries to instantiate the function, it eventually enters TemplateProxy::Instantiate, which tries to instantiate the real C++ function template. When the argument is passed as a string, this fails. The reason is that AtlasLikeDataVector<CustomStruct> was not autoloaded beforehand, whereas it is with ROOT.AtlasLikeDataVector[ROOT.CustomStruct].
One idea I'm testing right now is to have TemplateProxy::Instantiate retry after a first failure of Cppyy::GetMethodTemplate, loading the class information before trying to instantiate the template again. Leaving aside for a moment the fact that string manipulation of a full function template signature is shaky at best, I've started investigating what happens when calling TClass::GetClass("AtlasLikeDataVector<CustomStruct>") right before the new call to GetMethodTemplate. To my surprise, TClass immediately finds AtlasLikeDataVector<CustomStruct> via TClassTable::GetDictNorm(name) and exits early. Thus no autoloading happens, although it would be needed here to make the instantiation work.
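The retry idea can be illustrated with a toy model. This is a sketch in plain Python, not cppyy or ROOT code: `ToyTypeSystem` and `instantiate_with_retry` are hypothetical names, and the model assumes the ideal case in which loading the class information always succeeds on request.

```python
class ToyTypeSystem:
    """Toy stand-in for a lazily populated type system (not cppyy code)."""

    def __init__(self, available):
        self.available = set(available)  # classes that dictionaries could provide
        self.loaded = set()              # classes whose information is loaded

    def autoload(self, class_name):
        """Load class information on request; report whether it was found."""
        if class_name not in self.available:
            return False
        self.loaded.add(class_name)
        return True

    def instantiate(self, func, class_name):
        # Instantiation only succeeds once the argument class is loaded.
        if class_name not in self.loaded:
            raise TypeError(f'Could not find "{func}<{class_name}>"')
        return f"{func}<{class_name}>"


def instantiate_with_retry(ts, func, class_name):
    """Try once; on failure load the class information and try again."""
    try:
        return ts.instantiate(func, class_name)
    except TypeError:
        if not ts.autoload(class_name):
            raise
        return ts.instantiate(func, class_name)
```

In the real system the second attempt is not this simple: as described in this comment, TClass::GetClass finds the class and exits before any autoloading takes place.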
So, somehow, the information about AtlasLikeDataVector<CustomStruct> was loaded in the typesystem, but only partially/inconclusively. This happens as follows.
Back in the first call of GetMethodTemplate inside of TemplateProxy::Instantiate there is a call to TCling::GetFunctionWithPrototype to get the function signature corresponding to the string "foo<AtlasLikeDataVector<CustomStruct>>". This eventually calls into TSystem::Load and somehow TCling::AutoLoad("CustomStruct"). That is, only the information about the innermost part of the template is AutoLoaded, not AtlasLikeDataVector. But, since the loading is happening inside of the same library generated from the dictionary source, the AtlasLikeDataVector class is still loaded and cached in the list of classes available to ROOT. This is how it is then immediately found by TClass::GetClass. I attach a full stacktrace of this part.
I thought about getting the real normalized name of AtlasLikeDataVector<CustomStruct> which should be AtlasLikeDataVector<CustomStruct, DataModel_detail::NoBase> via TClassEdit::GetNormalizedName. That fails for the same reason: the class is already available, so the name never gets really normalized, the function returns early.
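The early return can be pictured with a toy normalizer. This is an illustrative sketch only, not how TClassEdit::GetNormalizedName is implemented: `DEFAULT_ARGS` and `toy_normalize` are hypothetical, and the parsing handles just a single explicit template argument.

```python
# Hypothetical table of default template arguments, for illustration only.
DEFAULT_ARGS = {"AtlasLikeDataVector": ["DataModel_detail::NoBase"]}


def toy_normalize(name, registered):
    """Toy model of a normalizer that returns early for known names."""
    if name in registered:
        return name  # early exit: the default arguments are never appended
    template, sep, rest = name.partition("<")
    if not sep or template not in DEFAULT_ARGS:
        return name  # not a template we know defaults for
    # rest looks like "CustomStruct>"; drop the trailing '>'
    args = [rest[:-1].strip()] + DEFAULT_ARGS[template]
    return f"{template}<{', '.join(args)}>"
```

Calling `toy_normalize("AtlasLikeDataVector<CustomStruct>", set())` appends the default argument, whereas passing a registry that already contains the short name returns it unchanged, which mirrors the behaviour described above.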
@pcanal I think TClass::GetClass should be able to detect this situation and call AutoLoad for AtlasLikeDataVector<CustomStruct>, but I don't know yet how.
Thank you for confirming and for your help with the reproducers! We can now close this issue.