Document e2e logging performance for time series data
We want to benchmark logging scalars, including setting a timeline value for each logged scalar, i.e. something like:

```py
from math import sin

import rerun as rr

rr.init("scalar_benchmark", spawn=True)  # app id is a placeholder; spawns a viewer to stream to
for frame_nr in range(1_000_000):
    rr.set_time_sequence("frame", frame_nr)
    rr.log("scalar", rr.TimeSeriesScalar(sin(frame_nr / 1000.0)))
```
We have the tool for it:
```sh
just rs-plot-dashboard --num-plots 10 --num-series-per-plot 5 --num-points-per-series 5000 --freq 1000
```
For each language (C++, Python, Rust), measure the max throughput (scalars per second), end-to-end (logging -> visualization), for both single-threaded/single-plot and multi-threaded logging (so 3 x 2 throughput figures).
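For reference, a minimal single-threaded harness on the Python side could look like the sketch below (the application id, entity path, and point count are placeholders, and this only times the logging half; the end-to-end figure also includes viewer ingestion):

```py
import time
from math import sin

import rerun as rr

rr.init("scalar_throughput", spawn=True)

n = 1_000_000  # placeholder point count
start = time.perf_counter()
for frame_nr in range(n):
    rr.set_time_sequence("frame", frame_nr)
    rr.log("scalar", rr.TimeSeriesScalar(sin(frame_nr / 1000.0)))
elapsed = time.perf_counter() - start
print(f"~{n / elapsed / 1000.0:.0f} kHz ({n} scalars in {elapsed:.1f} s)")
```

A multi-threaded variant would run the same loop from several threads, each logging to its own entity path.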
We also want to check the memory use in the viewer when we have logged 100M scalars or so, to measure the RAM overhead.
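The dashboard tool above could presumably be reused to populate the viewer for this check; the flag values here are illustrative (10 plots x 5 series x 2,000,000 points = 100M scalars):

```sh
just rs-plot-dashboard --num-plots 10 --num-series-per-plot 5 --num-points-per-series 2000000 --freq 1000
```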
Manually document this somewhere in our docs, e.g.:
On a 2023 MacBook M1:
| Language | Single-threaded | Multi-threaded |
|---|---|---|
| C++ | ? kHz | ? kHz |
| Python | ? kHz | ? kHz |
| Rust | ? kHz | ? kHz |
Viewing 100M scalars uses up ? GB of RAM in the native viewer.
Very rough numbers are fine, e.g. "~10 M scalars / second".
We should link to https://github.com/rerun-io/rerun/issues/4423 too
I know there was some decision to punt on this (and it was moved to Triage), so I'm lowering its urgency.
It would be nice to have a short comment explaining why we're punting on this, though.