[ntuple] Implement unbuffered parallel writing
Instead of using one RPageSinkBuf per context, implement a synchronizing page sink that compresses pages and writes them through to storage, but only commits them when the context's cluster is ready. This uses much less memory, but results in higher lock contention and very fragmented files.
We likely don't want to merge this because buffered writing offers better scalability and allows pages to be reordered, resulting in better read performance. But for future reference, this is how it could be implemented.
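For readers without the diff at hand, the write-through scheme described above can be sketched roughly as follows. This is a toy model, not the code in this PR: `ToySyncSink`, `ToyFillContext`, `PageLocator` and `RunToyExample` are hypothetical names, the "file" is an in-memory byte vector, and compression is only marked by a comment. The two contexts are interleaved on a single thread so that the resulting layout is deterministic; in real use each context would be driven by its own thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// All names below are hypothetical; this is a toy model, not ROOT code.
struct PageLocator {
   std::size_t fOffset; // where the page landed in the shared file
   std::size_t fSize;   // number of bytes written
};

// Shared sink: pages go to storage immediately, cluster metadata only on commit.
class ToySyncSink {
   std::mutex fMutex;
   std::vector<std::uint8_t> fFile;                 // stands in for the output file
   std::vector<std::vector<PageLocator>> fClusters; // committed clusters only

public:
   // Append one sealed (already compressed) page; every caller contends on fMutex.
   PageLocator WritePage(const std::vector<std::uint8_t> &sealedPage)
   {
      assert(!sealedPage.empty());
      std::lock_guard<std::mutex> guard(fMutex);
      PageLocator locator{fFile.size(), sealedPage.size()};
      fFile.insert(fFile.end(), sealedPage.begin(), sealedPage.end());
      return locator;
   }

   // Make a finished cluster visible by publishing its page locators.
   void CommitCluster(std::vector<PageLocator> locators)
   {
      std::lock_guard<std::mutex> guard(fMutex);
      fClusters.push_back(std::move(locators));
   }

   // Return a copy of the committed cluster metadata.
   std::vector<std::vector<PageLocator>> GetClusters()
   {
      std::lock_guard<std::mutex> guard(fMutex);
      return fClusters;
   }
};

// Per-writer state: only the locators are kept in memory, not the page data.
class ToyFillContext {
   ToySyncSink &fSink;
   std::vector<PageLocator> fPending;

public:
   explicit ToyFillContext(ToySyncSink &sink) : fSink(sink) {}

   void WritePage(const std::vector<std::uint8_t> &page)
   {
      // A real sink would compress `page` here, before taking the lock.
      fPending.push_back(fSink.WritePage(page));
   }

   void CommitCluster()
   {
      fSink.CommitCluster(std::move(fPending));
      fPending.clear(); // moved-from vector: reset to a known empty state
   }
};

// Interleave two contexts on one thread to make the resulting layout deterministic.
std::vector<std::vector<PageLocator>> RunToyExample()
{
   ToySyncSink sink;
   ToyFillContext a(sink), b(sink);
   for (int i = 0; i < 3; ++i) {
      a.WritePage(std::vector<std::uint8_t>(4, std::uint8_t{0xAA}));
      b.WritePage(std::vector<std::uint8_t>(4, std::uint8_t{0xBB}));
   }
   b.CommitCluster(); // commit order is independent of write order
   a.CommitCluster();
   return sink.GetClusters();
}
```

In this toy run, the pages of context `a` end up at offsets 0, 8 and 16, and those of `b` at 4, 12 and 20. Each context holds only its locators in memory, but the pages of one cluster are no longer contiguous in the file, and every page write takes the shared lock.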
Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
Test Results
9 files, 9 suites, 1d 16h 47m 35s
2 634 tests: 2 633 passed, 0 skipped, 1 failed
22 331 runs: 22 330 passed, 0 skipped, 1 failed
Results for commit 9ad6150b.
Of note, this has a reverse conflict with https://github.com/root-project/root/pull/15239, which currently documents that parallel writing is always buffered.
As discussed and mentioned before, we will require buffered writing with the RNTupleParallelWriter because of its better scalability and less fragmented output files.