Yet another example of a default template argument causing problems

Open Nowakus opened this issue 6 months ago • 2 comments

Check duplicate issues.

  • [x] Checked for duplicates

Description

Calling RNTupleReader::GetView() for a column whose type has a default template argument, while passing the full type name (or type_info), results in an abort.

tracked in: https://its.cern.ch/jira/browse/ATEAM-1087

Reproducer

export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source $ATLAS_LOCAL_ROOT_BASE/user/atlasLocalSetup.sh
asetup main--dev3LCG,latest,Athena
root

auto reader = ROOT::RNTupleReader::Open("EventData", "/afs/cern.ch/user/m/mnowak/public/DAOD_PHYS.rntuple.pool.root");
auto view = reader->GetView<void>("EventInfoAuxDyn:hardScatterVertexLink", nullptr, "ElementLink<DataVector<xAOD::Vertex_v1,DataModel_detail::NoBase> >");

Fatal: expandedTemplateArgs.size() >= templateArgs.size() violated at line 314 of `/build/jenkins/workspace/lcg_nightly_pipeline/build/projects/ROOT-HEAD/src/ROOT/HEAD/tree/ntuple/src/RFieldUtils.cxx'
aborting

ROOT version

ROOT 6.37.01 from May 30 2025, 23:23:40

Installation method

LCG dev3 nightly

Operating system

Linux

Additional context

No response

Nowakus avatar Jun 02 '25 14:06 Nowakus

To clarify a bit - we actually don't use string type names, like in my example. That was just to make it easy to see what is going on. In our code we use type_info, which we pass to GetView() - but the type_info always has the default template argument spelled out, like this:

root  [9] TClass *tc = TClass::GetClass("ElementLink<DataVector<xAOD::Vertex_v1> >")
root [10] tc->GetName()
(const char *) "ElementLink<DataVector<xAOD::Vertex_v1> >"
root [11] tc->GetTypeInfo()->name()
(const char *) "11ElementLinkI10DataVectorIN4xAOD9Vertex_v1EN16DataModel_detail6NoBaseEEE"
root [12] .!c++filt -t 11ElementLinkI10DataVectorIN4xAOD9Vertex_v1EN16DataModel_detail6NoBaseEEE
ElementLink<DataVector<xAOD::Vertex_v1, DataModel_detail::NoBase> >

This is probably how GetView() gets conflicting typename expansions.
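The effect described above can be reproduced without ROOT or Athena. The following is a minimal sketch, assuming a GCC or Clang toolchain (it relies on `abi::__cxa_demangle` from `<cxxabi.h>`). The `toy` namespace and the helper names `DemangledName` and `ShortSpellingViaTypeInfo` are hypothetical stand-ins, not ATLAS or ROOT code; only the template shape (a second parameter with a default) mirrors `DataVector`.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cassert>
#include <cstdlib>
#include <memory>
#include <string>
#include <typeinfo>

// Hypothetical stand-ins for the ATLAS types; only the template shape matters.
namespace toy {
struct NoBase {};
struct Vertex {};
template <typename T, typename Base = NoBase>
struct DataVector {};
template <typename Container>
struct ElementLink {};
} // namespace toy

// Releases the buffer that __cxa_demangle allocates with malloc.
struct FreeDeleter {
   void operator()(char *p) const { std::free(p); }
};

// Demangle a std::type_info name; falls back to the mangled name on failure.
std::string DemangledName(const std::type_info &ti)
{
   int status = 0;
   std::unique_ptr<char, FreeDeleter> buf(
      abi::__cxa_demangle(ti.name(), nullptr, nullptr, &status));
   return (status == 0 && buf) ? std::string(buf.get()) : std::string(ti.name());
}

// The type is written with the short spelling (default argument omitted),
// but the name obtained through type_info still contains the default.
std::string ShortSpellingViaTypeInfo()
{
   return DemangledName(typeid(toy::ElementLink<toy::DataVector<toy::Vertex>>));
}
```

Calling `ShortSpellingViaTypeInfo()` returns a name in which `toy::NoBase` appears explicitly, even though the source never wrote it. Any code that derives a type name from `type_info` will therefore see the fully spelled form.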

Nowakus avatar Jun 03 '25 09:06 Nowakus

I think the core difference is that TClass is happy with all possible forms of this type (short name, full name and type_info), but RNTuple only accepts the short name version (see below). Anyhow, I was able to switch Athena to use the short typenames (and Attila claims that's the right thing to do in any case). So we will not be unhappy if you postpone this or decide not to fix it at all.

root [0] TClass::GetClass( "ElementLink<DataVector<xAOD::Vertex_v1, DataModel_detail::NoBase> >" )->GetName();
(const char *) "ElementLink<DataVector<xAOD::Vertex_v1> >"

root [2] ROOT::RFieldBase::Create("f1", "ElementLink<DataVector<xAOD::Vertex_v1, DataModel_detail::NoBase> >").Unwrap();
Fatal: expandedTemplateArgs.size() >= templateArgs.size() violated at line 314 of `/build/jenkins/workspace/lcg_nightly_pipeline/build/projects/ROOT-HEAD/src/ROOT/HEAD/tree/ntuple/src/RFieldUtils.cxx'
aborting
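One plausible reading of the assertion message is that the requested name carries more template arguments than the normalized one. The sketch below illustrates only that counting mismatch. It is a toy, not the code in RFieldUtils.cxx: `Trim`, `TopLevelTemplateArgs` and `ArgCountsConsistent` are hypothetical helpers, and mapping "requested" and "normalized" onto `templateArgs` and `expandedTemplateArgs` is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Strip leading and trailing spaces.
std::string Trim(const std::string &s)
{
   const auto first = s.find_first_not_of(' ');
   if (first == std::string::npos)
      return "";
   const auto last = s.find_last_not_of(' ');
   return s.substr(first, last - first + 1);
}

// Split the top-level template arguments of a name such as "A<B<C>, D>".
// Toy helper for illustration; not ROOT's actual parser.
std::vector<std::string> TopLevelTemplateArgs(const std::string &typeName)
{
   std::vector<std::string> args;
   const auto open = typeName.find('<');
   const auto close = typeName.rfind('>');
   if (open == std::string::npos || close == std::string::npos || close < open)
      return args; // not a template instantiation

   int depth = 0; // nesting level inside inner template argument lists
   std::string current;
   for (std::size_t i = open + 1; i < close; ++i) {
      const char c = typeName[i];
      if (c == '<')
         ++depth;
      if (c == '>')
         --depth;
      if (c == ',' && depth == 0) {
         args.push_back(Trim(current));
         current.clear();
      } else {
         current += c;
      }
   }
   if (!Trim(current).empty())
      args.push_back(Trim(current));
   return args;
}

// Same shape as the failing check (assumed roles): the normalized name is
// expected to have at least as many arguments as the requested name.
bool ArgCountsConsistent(const std::string &requested, const std::string &normalized)
{
   return TopLevelTemplateArgs(normalized).size() >= TopLevelTemplateArgs(requested).size();
}
```

With `"DataVector<xAOD::Vertex_v1, DataModel_detail::NoBase>"` as the requested name and `"DataVector<xAOD::Vertex_v1>"` as the normalized name, `ArgCountsConsistent` returns false, because the requested spelling has two arguments and the normalized one has only one.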

Nowakus avatar Jun 06 '25 12:06 Nowakus

Can still reproduce using the lines in the issue description. Also fyi @enirolf @vepadulano

hahnjo avatar Jul 11 '25 12:07 hahnjo

On the RNTuple side, this has been fixed by #19323.

The reproducer (when compiled) and the lines of https://github.com/root-project/root/issues/18935#issuecomment-2949122883 now work.

The reproducer in the interactive prompt still crashes with a segfault, but that seems Cling-related. In fact, on the prompt even just

ElementLink<DataVector<xAOD::Vertex_v1,DataModel_detail::NoBase> > el;

crashes.

@enirolf I'm reassigning this to you as ATLAS liaison. Please feel free to forward the ticket as appropriate.

jblomer avatar Sep 09 '25 10:09 jblomer

> The reproducer in the interactive prompt still crashes with a segfault, but that seems Cling-related. In fact, on the prompt even just
>
> ElementLink<DataVector<xAOD::Vertex_v1,DataModel_detail::NoBase> > el;
>
> crashes.

@hahnjo, was this fixed by https://github.com/root-project/root/pull/19892? I'm not sure whether this issue can be closed now, too.

ferdymercury avatar Sep 19 '25 10:09 ferdymercury

The original reproducer now also works on the prompt, but just creating an instance on the prompt:

ElementLink<DataVector<xAOD::Vertex_v1,DataModel_detail::NoBase> > el;

does not. This is probably a different issue.

I don't know how we want to handle this, maybe this issue can stay RNTuple focused (and thus be closed)?

hahnjo avatar Sep 19 '25 10:09 hahnjo

> This is probably a different issue.

Side note: other DataVector-related issues: https://github.com/root-project/root/issues/14186 https://github.com/root-project/root/issues/15996

ferdymercury avatar Sep 19 '25 11:09 ferdymercury