Make sure bulk reading works with prefetching
This is currently not the case when the branch has more than 1 basket.
This PR fixes #8962.
@pcanal, this makes things work for me, though I am not sure it's the correct thing to do. I think the real issue is https://github.com/root-project/root/blob/master/tree/tree/src/TBranch.cxx#L1413 but I couldn't quite understand why, or how to fix it.
@ktf (I can reproduce the problem now.) Commit b2c6904bf458d5149e3955881d0054ba256a4110 seems to have broken the expectation of only one basket in flight. I also wonder how it interacts with the interface; in particular, it is clear that this PR removes the crashes, but does it lead to correct data being read through the bulk interface?
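For context, here is a rough sketch of how such data is typically pulled through the experimental bulk interface (TBranch::GetBulkRead()); the file, tree, and branch names and the flat float branch layout are assumptions, and the exact API may vary between ROOT versions:

```cpp
// Sketch only: assumes a flat float branch; all names are illustrative.
#include "TBranch.h"
#include "TBufferFile.h"
#include "TFile.h"
#include "TTree.h"

void bulk_read_sketch()
{
   TFile f("data.root");
   TTree *tree = nullptr;
   f.GetObject("Events", tree);
   if (!tree) return;

   TBranch *branch = tree->GetBranch("pt");
   TBufferFile buf(TBuffer::kWrite, 32 * 1024);

   Long64_t evt = 0;
   while (evt < tree->GetEntries()) {
      // Deserializes a whole basket worth of entries into buf.
      Int_t count = branch->GetBulkRead().GetBulkEntries(evt, buf);
      if (count < 0)
         break;
      auto *values = reinterpret_cast<float *>(buf.GetCurrent());
      (void)values; // values[0..count-1] correspond to entries evt .. evt+count-1
      evt += count;
   }
}
```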
Test Results
17 files, 17 suites, 3d 10h 37m 15s :stopwatch:
2 714 tests: 2 707 :white_check_mark:, 0 :zzz:, 7 :x:
43 528 runs: 43 521 :white_check_mark:, 0 :zzz:, 7 :x:
For more details on these failures, see this check.
Results for commit 75a6cbfb.
:recycle: This comment has been updated with latest results.
Some tests are failing with Error: Value cannot be null. (Parameter 'ContainerId'). Not sure if it is my fault...
@pcanal Indeed, this still breaks my (more elaborate) integration tests...
Never mind, I forgot to rebuild something and this does indeed break ABI compatibility because of the newly introduced vector...
@ktf Can you test the additional commit that I pushed here? This is another API breaking change (new function).
Thanks. I am doing so right now. I should have something in an hour or so.
This seems to leak somewhere. I have:
[memory usage plot]
where it used to be flat.
> This seems to leak somewhere.
Yep, I can reproduce it .... investigating.
I pushed a fix to the memory leak.
Unfortunately the Windows failure is related. I am checking.
On the other hand, it seems to have fixed the leak, and memory usage is back to the expected level.
@ktf @dpiparo The only problem left is with the test itself: when prefetching is enabled it requires more than 2 GB of memory and thus fails on 32-bit platforms. I.e. @ktf you can use the PR as-is if need be; it shall be merged soon.
Actually, this is fixed but might still not be doing what is meant :( ... and I am not sure if you actually need to use this feature.
What do you intend to gain by calling SetClusterPrefetch?
The feature enabled by SetClusterPrefetch is to load all the baskets of the cluster into memory so that, within a cluster, you can have cheap random access to the entries (instead of having to decompress them again and again).
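For illustration, a minimal sketch of the access pattern this feature targets; the file and tree names and the cache size below are made up, only SetClusterPrefetch and the cluster iterator are the point:

```cpp
// Sketch only: names and sizes are illustrative.
#include "TFile.h"
#include "TTree.h"

void cluster_prefetch_sketch()
{
   TFile f("data.root");
   TTree *tree = nullptr;
   f.GetObject("Events", tree);
   if (!tree) return;

   tree->SetCacheSize(25000000);    // TTreeCache: prefetch compressed data from disk
   tree->SetClusterPrefetch(true);  // decompress the whole current cluster up front

   // Jump around inside one cluster: with cluster prefetching the baskets
   // stay decompressed in memory, so this does not re-decompress them.
   auto clusterIter = tree->GetClusterIterator(0);
   Long64_t start = clusterIter();              // first entry of the first cluster
   Long64_t end   = clusterIter.GetNextEntry(); // first entry of the next cluster
   tree->GetEntry(end - 1);
   tree->GetEntry(start);
   tree->GetEntry((start + end) / 2);
}
```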
(At least) one optimization in place actually runs counter to this and needs to be removed (the optimization avoids a memory copy by handing the uncompressed buffer back to the user as is ... but then it is no longer there).
And in essence the fix we have here is also incorrect :(. When 'ClusterPrefetching' is on, we should actually always leave the basket as-is in the list of baskets for the next call to possibly use (at least until the end of the cluster).
So you could indeed proceed with using this as it functions (returns the right result) but it does not yet implement the ClusterPrefetching optimization (i.e. it does not do what it is supposed to do), so you could also just as well turn it off (temporarily).
I was enabling SetClusterPrefetch as part of the attempt to reduce read_calls when processing our AODs.
Indeed I now notice that it's enough to simply do:
```cpp
// Was affected by https://github.com/root-project/root/issues/8962
// Re-enabling this seems to cut the number of IOPS in half
tree->SetCacheSize(25000000);
//tree->SetClusterPrefetch(true);
for (auto& reader : mBranchReaders) {
  tree->AddBranchToCache(reader->branch());
}
tree->StopCacheLearningPhase();
```
to obtain the same result, so I am fine with simply disabling it for now. Do I understand correctly that I still need this patch, though, in case there is more than one basket?
> I was enabling SetClusterPrefetch as part of the attempt to reduce read_calls when processing our AODs.
A priori it is not intended to have an effect on that.
> Indeed I now notice that it's enough to simply do:
What is the change (increase of the cache size or explicit cache learning or both)?
> What is the change (increase of the cache size or explicit cache learning or both)?
Doing the caching at all. I thought prefetching was part of it, but apparently it is not.
> I thought prefetching was part of it, but apparently it is not.
The names are really confusing ... sorry.
1. SetCacheSize enables/extends the TTreeCache, whose 'real' job is to actually prefetch (grab early from disk) the compressed data. This is the tuning that controls the size of the reads from disk.
2. SetClusterPrefetch enables the early decompression of the baskets of the current cluster (whose compressed data is already in memory if used in conjunction with the TTreeCache). This affects performance only in conjunction with non-sequential use/load/read of the entries.
3. gEnv->SetValue("TFile.AsyncPrefetching", 1), in conjunction with a TFileCacheRead (for example the TTreeCache), will asynchronously grab the compressed data of the 'next' cluster early, while the current cluster is being processed (i.e. is the subject of GetEntry).
This last setting (3) might be of interest in your case.
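To make the above concrete, here is a minimal sketch that combines the three settings for a plain sequential read; the file name, tree name, and cache size are placeholder values, not taken from the discussion above:

```cpp
// Sketch only: all names/values below are illustrative.
#include "TEnv.h"
#include "TFile.h"
#include "TTree.h"

void caching_sketch()
{
   // (3) Asynchronously fetch the compressed data of the next cluster
   //     while the current one is processed; set before opening the file.
   gEnv->SetValue("TFile.AsyncPrefetching", 1);

   TFile f("data.root");
   TTree *tree = nullptr;
   f.GetObject("Events", tree);
   if (!tree) return;

   // (1) TTreeCache: prefetch compressed data from disk in large reads.
   tree->SetCacheSize(25000000);
   tree->AddBranchToCache("*", true);
   tree->StopCacheLearningPhase();

   // (2) Decompress all baskets of the current cluster up front; this
   //     only pays off for non-sequential access within a cluster.
   tree->SetClusterPrefetch(true);

   for (Long64_t i = 0; i < tree->GetEntries(); ++i)
      tree->GetEntry(i);
}
```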
Ok, thank you for your explanations. I can confirm that 1 works already for me. I will try 3. 2 I probably do not need, actually.