opentelemetry-rust
opentelemetry-rust copied to clipboard
Proposal: Alter RNG for TraceId and SpanId
Base on doing some cursory benchmarking I was seeing that in
main
roughtly 28% of the time spent on Span::start()
is 20% in trace_id generation and 8% in span_id generation,
I originally dug into this because of #800 out of interest in how much time sampling versus non sampling took.
I can run some additional benchmarks to ensure consistency. However, I am currently thinking something like https://crates.io/crates/oorandom would be a faster default. It's #[no_std]
and also #[forbid(unsafe_code)]
Finally it's a PCG so statistically it should be ok.
Different RNG does not necessarily require a different crate -- the rand crate comes with several types of RNG. The first question is if we need a cryptographically secure RNG for this, which we do seem to be using here. If we think we don't need a secure RNG, we could try the SmallRng
, as explained in https://docs.rs/rand/latest/rand/rngs/index.html#our-generators.
Different RNG does not necessarily require a different crate -- the rand crate comes with several types of RNG. The first question is if we need a cryptographically secure RNG for this, which we do seem to be using here. If we think we don't need a secure RNG, we could try the
SmallRng
, as explained in https://docs.rs/rand/latest/rand/rngs/index.html#our-generators.
Agreed as first change SmallRng might be alright. I'll do some initial testing.
From my initial benchmarks looks like minor but not insignificant improvement. ~9% improvement ran a few times to attempt normalizing it.
@hdost, curious if you're still planning to move forward with this change?
Yes I can rebase put out a patch.
@djc So I put out patch #1106 I looked into SmallRng, which did provide some speed increases, max 9% however, it's less portable than the Pcg Rng that I used in #1106 and the results are quite good as well.
Finally got back to this, and when running the stress test for the SmallRng we're seeing about a 6% improvement at the best, to negligible at worst.
Using Existing Rng
Number of threads: 6
Throughput: 6,093,600 iterations/sec
Throughput: 6,099,200 iterations/sec
Throughput: 6,141,400 iterations/sec
Throughput: 6,004,200 iterations/sec
Using SmallRng
Number of threads: 6
Throughput: 6,448,000 iterations/sec
Throughput: 6,407,800 iterations/sec
Throughput: 6,429,200 iterations/sec
Throughput: 6,338,800 iterations/sec