Parallel ingestion
Redesign ingestion with dedicated ingestion worker
Addresses #4298 by moving CPU-intensive Arrow message processing off the UI thread.
Key changes:
- **Ingestion Worker** (`crates/viewer/re_viewer/src/ingestion_worker.rs`):
  - Dedicated background thread for processing Arrow messages into chunks
  - Bounded channel (capacity: 2000) provides natural backpressure
- **Viewer Integration** (`crates/viewer/re_viewer/src/app.rs`):
  - Arrow messages are submitted to the worker queue instead of blocking the UI thread
  - Worker output is polled during `update()` and chunks are added to the `EntityDb`
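The pattern above can be sketched with the standard library's bounded channels. This is a minimal illustration only: the type names (`IngestionWorker`, `ArrowMsg`, `Chunk`) and method names are hypothetical stand-ins, and the real code decodes Arrow data rather than passing integers through.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TryRecvError};
use std::thread;

// Hypothetical stand-ins for the real Arrow message and chunk types.
struct ArrowMsg(u64);
struct Chunk(u64);

/// A bounded input queue gives natural backpressure: `send` blocks
/// once the queue already holds `capacity` unprocessed messages.
struct IngestionWorker {
    tx: SyncSender<ArrowMsg>,
    rx: Receiver<Chunk>,
}

impl IngestionWorker {
    fn spawn(capacity: usize) -> Self {
        let (msg_tx, msg_rx) = sync_channel::<ArrowMsg>(capacity);
        let (chunk_tx, chunk_rx) = sync_channel::<Chunk>(capacity);
        thread::spawn(move || {
            // CPU-intensive decoding happens here, off the UI thread.
            for msg in msg_rx {
                if chunk_tx.send(Chunk(msg.0)).is_err() {
                    break; // receiver side was dropped
                }
            }
        });
        Self { tx: msg_tx, rx: chunk_rx }
    }

    /// Called from the UI thread; only blocks when the queue is full.
    fn submit(&self, msg: ArrowMsg) {
        let _ = self.tx.send(msg);
    }

    /// Non-blocking poll, called once per frame from `update()`.
    fn poll_processed_chunks(&self) -> Vec<Chunk> {
        let mut chunks = Vec::new();
        loop {
            match self.rx.try_recv() {
                Ok(chunk) => chunks.push(chunk),
                Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
            }
        }
        chunks
    }
}
```

The key point is that `poll_processed_chunks()` never blocks, so a frame that finds no finished chunks costs almost nothing.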
Can you please rebase your PR? Seems it's based on a rather old commit from main. Notably arrow has been updated since.
Done!
I'm still seeing your fork's main being quite a few commits behind ours.
Mea culpa! Should be good now.
Also, I'm available on Discord if you have any questions.
I'm observing a fairly large reduction of the "time to fully loaded" with a 2h air traffic data RRD: 30s -> 9s.
main
https://github.com/user-attachments/assets/48c36fc3-9152-40ae-8c43-52114dcde426
this PR
https://github.com/user-attachments/assets/9204d6c6-aeb5-4a80-9f6d-b0c08ef94176
Based on the above benchmark, could you provide a more thorough analysis of the performance profile of this PR with Puffin?
You should be able to produce the rrd using `pixi run -e examples air_traffic_data --dataset 2h --save output.rrd` (I can also send mine if needed).
I'm working on it!
This is what the old ingestion performance looks like, using the floating metrics window from PR #11820 against the main branch.
https://github.com/user-attachments/assets/212cc794-db54-43d1-bb36-dc056d98dd5a
Time elapsed is ~65s to load the air traffic 2h data set. Throughput is 2017 msg/s.
And this is what new ingestion performance looks like, using the same floating metrics window, adapted to parallel processing and included in this PR.
https://github.com/user-attachments/assets/559afd09-16bf-4f16-b0dc-de6e3f0e6ef7
You can see that ingestion time went down to ~12s and throughput went up >5x.
@abey79 This confirms your findings from yesterday. It also confirms that performance did not decrease with the redesign.
Before:

```
App ──> knows about workers
 └──> manually iterates EntityDbs
      └──> calls poll_worker_output()
           └──> processes results
```

After:

```
App ──> store_hub.on_frame_start()
 └──> store_bundle.on_frame_start()
      └──> entity_db.on_frame_start() (for each)
           └──> worker.poll_processed_chunks() ✅ Encapsulated!
```
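The "after" chain can be sketched as plain delegation: the app only talks to the store hub, and each layer forwards the per-frame tick downward. This is an illustrative skeleton with dummy types, not the actual `re_viewer` code; the real worker yields decoded chunks, here replaced by a fixed vector.

```rust
// Dummy worker: pretends to have three processed chunks ready.
struct Worker;
impl Worker {
    fn poll_processed_chunks(&self) -> Vec<u64> {
        vec![1, 2, 3]
    }
}

struct EntityDb {
    worker: Worker,
    num_chunks: usize,
}
impl EntityDb {
    // The worker is fully encapsulated: only the EntityDb touches it.
    fn on_frame_start(&mut self) {
        self.num_chunks += self.worker.poll_processed_chunks().len();
    }
}

struct StoreBundle {
    entity_dbs: Vec<EntityDb>,
}
impl StoreBundle {
    fn on_frame_start(&mut self) {
        for db in &mut self.entity_dbs {
            db.on_frame_start();
        }
    }
}

struct StoreHub {
    store_bundle: StoreBundle,
}
impl StoreHub {
    // The app calls only this; it knows nothing about workers.
    fn on_frame_start(&mut self) {
        self.store_bundle.on_frame_start();
    }
}
```

The design win is that the app no longer needs to know workers exist; adding or removing an `EntityDb` automatically wires its worker into the frame loop.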
The metrics are key to the whole thing, in my opinion, as is performance tracking. I don't know how to tell if the PR provides true performance enhancements without metrics!
I kept the metrics commit at `HEAD` so I could drop it from this PR and create another one on top of the same branch.
I'm not saying metrics are not useful, and it's great to have them to document the performance increase. But I cannot review this PR, let alone land it, unless the diff only covers the actual ingestion part. Using git, you should be able to rebase the metrics UI stuff on top of this branch, such that you can use it locally without it appearing here.
Cut the head off, please check!
Should be ready to go!