criterion.rs
Eliminate memory layout bias when measuring (with LLVM stabilizer)
I'm sorry this isn't a better issue; I haven't really had time to research the subject and I'm quite tired at the moment. That said, I've recently learned just how much memory layout can impact the performance of programs. I recommend watching this video from 10:00 on: https://youtu.be/r-TLSBdHe1A?t=598 (the link is already timestamped)
The paper mentioned in the talk (about the effects of memory layout) can be found here: https://users.cs.northwestern.edu/~robby/courses/322-2013-spring/mytkowicz-wrong-data.pdf
Stabilizer can be found here: https://github.com/ccurtsinger/stabilizer
This might be wildly out of scope for criterion, but I was wondering 1) whether this is common knowledge and 2) whether anything is already done to mitigate that bias when measuring performance with criterion.rs.
This is not really an issue, more of a question - but a "fix" could be as simple as adding a bit to the documentation that warns users about this (especially since it seems criterion compares with previous runs?) and points them to tools they can use to mitigate that.
I haven't tried running criterion benchmarks through stabilizer yet; in fact I haven't tried anything at all yet. I'm just opening this before I forget, and because it's very cool (and scary - but cool).
Yes, I've seen that talk as well. I also hadn't realized that memory layout was so unstable. Unfortunately, as you say, this is indeed out of scope for Criterion.rs.
Stabilizer breaks down to three main aspects:
- Adding a random offset to the stack. This could potentially be done in Criterion but it would require some custom C code or something.
- Randomizing the heap layout. This might be possible in Rust but it would require a custom global allocator crate.
- Randomizing the code layout. This is not possible without compiler support (stabilizer itself implements a custom LLVM pass).
Changing the stack offset is non-trivial, but probably doable. Not sure how useful that is by itself. The heap thing, I don't even know where to start. For the code layout, one would have to implement it in rustc before I could even try.
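For the heap, a much cruder stand-in than a custom global allocator would be to perturb the heap before each measurement: allocate a random number of random-sized blocks and keep them alive while the benchmark runs, so later allocations are likely to land at different addresses. This is not what Stabilizer does (it re-randomizes the layout during execution), and whether it moves anything at all depends on the allocator. The names `random_u64` and `perturb_heap` and both parameters are made up for this sketch, which uses only the standard library:

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::hint::black_box;

/// Returns a pseudo-random u64 using only std.
/// (Hypothetical helper; good enough for jitter, not for anything else.)
fn random_u64() -> u64 {
    RandomState::new().build_hasher().finish()
}

/// Allocates between 0 and `max_blocks` blocks, each between 1 and
/// `max_size` bytes, and returns them. The caller keeps the returned
/// vector alive for the duration of the measurement and drops it afterwards.
fn perturb_heap(max_blocks: usize, max_size: usize) -> Vec<Vec<u8>> {
    let count = (random_u64() as usize) % (max_blocks + 1);
    let mut blocks = Vec::with_capacity(count);
    for _ in 0..count {
        let size = 1 + (random_u64() as usize) % max_size.max(1);
        // black_box discourages the optimizer from eliding the allocation.
        blocks.push(black_box(vec![0u8; size]));
    }
    blocks
}
```

A benchmark would call something like `let _guard = perturb_heap(64, 4096);` before its setup code and let `_guard` drop at the end of the run.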
Wouldn't randomizing the stack offset be relatively easy, e.g. by using https://docs.rs/stackalloc/latest/stackalloc/ or https://crates.io/crates/alloca before a test?
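To make the idea concrete without depending on either crate's API, here is a rough std-only sketch that shifts the stack by recursing through a random number of padding frames before calling the benchmarked closure. It is an assumption-laden illustration, not how Criterion.rs or Stabilizer work: `PAD_BYTES`, `random_u64`, `with_stack_padding` and `with_random_stack_offset` are hypothetical names, and the actual frame size is up to the compiler.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::hint::black_box;

/// Bytes of padding each recursion level tries to add (arbitrary choice).
const PAD_BYTES: usize = 64;

/// Returns a pseudo-random u64 using only std (hypothetical helper).
fn random_u64() -> u64 {
    RandomState::new().build_hasher().finish()
}

/// Calls `f` beneath `levels + 1` extra stack frames, each holding a
/// `PAD_BYTES` buffer, so `f` runs at a correspondingly lower stack address.
#[inline(never)]
fn with_stack_padding<R>(levels: usize, f: &mut dyn FnMut() -> R) -> R {
    let mut pad = [0u8; PAD_BYTES];
    // black_box discourages the optimizer from removing the buffer.
    black_box(&mut pad);
    let result = if levels == 0 {
        f()
    } else {
        with_stack_padding(levels - 1, f)
    };
    // Touch the buffer after the call so this frame cannot be turned
    // into a tail call and must stay live while `f` runs.
    black_box(&mut pad);
    result
}

/// Runs `f` once at a stack depth shifted by 0..=`max_levels` extra frames.
fn with_random_stack_offset<R>(max_levels: usize, mut f: impl FnMut() -> R) -> R {
    let levels = (random_u64() as usize) % (max_levels + 1);
    with_stack_padding(levels, &mut f)
}
```

Wrapping the measured routine as `with_random_stack_offset(32, || routine())` would give each invocation a different starting stack address; a crate with a real `alloca` could do the same with byte-level granularity instead of whole frames.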
I regularly run into caching issues, which make my measurements hard to use.
I wasn't aware of those crates, but yeah, those should make it pretty easy to add random stack offsets. I'll reopen this to investigate whether that would be useful, but I'd consider it to be wishlist-priority.