[cling] Memory hogging when checking if type is an enum
Initially reported here:
https://root-forum.cern.ch/t/pyroot-and-std-vector-dramatic-ram-usage/49711
when a user was getting a vector<int> branch from a tree in a loop in PyROOT.
After a bit of digging, it seems the culprit is a call that is made from cppyy to TInterpreter::ClassInfo_IsEnum. The following is a C++ reproducer (tested with current master):
float getMem() {
auto info = ProcInfo_t();
gSystem->GetProcInfo(&info);
float mem = (float)info.fMemResident;
return mem*1e-3;
}
void test()
{
for (int i=0; i < 100000; ++i) {
gInterpreter->ClassInfo_IsEnum("vector<int>");
if (i%1000 == 0)
printf("RSS at iteration %d is %f\n", i, getMem());
}
}
When running the macro above, it can be seen how the memory usage increases after each iteration.
Related (or maybe the exact same issue): https://github.com/root-project/root/issues/9029
I don't know if it is an option at that place in the code but calling a combination of
TClass::GetClass(name_of_type);
and
TEnum::GetEnum(name_of_type);
should be more memory efficient.
Dear experts, @etejedor,
I think I am facing a similar memory usage problem in PyROOT (ROOT 6.24/06).
Do you know if this issue has been fixed? If yes, in which release?
Thanks a lot! Cheers, Alberto
Dear Alberto,
I'm afraid this is not yet fixed, but @Axel-Naumann can confirm.
@etejedor https://github.com/root-project/root/issues/10454#issuecomment-1110193565 suggests a change in PyROOT. Can you confirm whether that helps?
So iiuc the suggestion is to replace this line by something like return TEnum::GetEnum(tn_short.c_str());? Where should TClass::GetClass be used here and for what purpose?
Something along the lines of
if (TClass::GetClass(tn_short.c_str()) return false;
return gInterpreter->ClassInfo_IsEnum(tn_short.c_str());
maybe?
Thanks @Axel-Naumann , so TClass::GetClass is used to discard classes faster instead of just going directly for TEnum::GetEnum? (I assume the second line in your code snippet meant to use TEnum::GetEnum and not the current ClassInfo_IsEnum, which seems to be behind the memory hogging).
Right - TEnum::GetEnum() is doing these things already, unlike ClassInfo_IsEnum(). So just using TEnum::GetEnum() instead of ClassInfo_IsEnum() would be enough!
TEnum::GetEnum() does not seem to suffer from the same issue. https://github.com/root-project/root/pull/11412 replaces the use of ClassInfo_IsEnum() by TEnum::GetEnum() in clingwrapper so there is no memory hogging in PyROOT for this case.
#11412 revealed a test failure caused by the switch to TEnum::GetEnum. This is a reproducer in C++, extracted from the test that fails:
enum EFruit {kApple=78, kBanana=29, kCitrus=34};
void repro()
{
auto type = "std::vector<EFruit>::value_type";
cout << gInterpreter->ClassInfo_IsEnum(type) << endl;
cout << TEnum::GetEnum(type) << endl;
}
which prints
1
0
The different answer between the two calls is what makes the test fail.
Another instance of the issue can be seen at: https://root-forum.cern.ch/t/possible-memory-leakage/54519
in https://github.com/root-project/root/issues/10454#issuecomment-1257779330, the typedef is not being resolved, i.e.
enum EFruit {kApple=78, kBanana=29, kCitrus=34};
void repro()
{
auto type = "std::vector<EFruit>::value_type";
cout << gInterpreter->ClassInfo_IsEnum(type) << endl;
cout << TEnum::GetEnum(type) << endl;
auto resolved = TClassEdit::ResolveTypedef (type, kTRUE);
cout << TEnum::GetEnum(resolved.c_str()) << endl;
}
prints
1
0
0x2048750