packed_simd
Expand Machine Code Analysis page in the performance guide
This is a tracking issue for expanding upon the machine code analysis part of the perf guide.
To do
- [ ] how to emit assembly from Rust code (e.g. `cargo-asm`)
- [ ] how to emit IACA / `llvm-mca` markers with inline assembly
- [ ] give an example of how to interpret the results, in the context of one of the example benchmarks
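
As a starting point for the second item, here is a minimal sketch of emitting `llvm-mca` region markers from Rust inline assembly. The `mca_marker!` macro and the `dot` function are hypothetical names made up for this illustration and are not part of `packed_simd`. The sketch assumes a toolchain with stable `asm!` and only emits markers on x86/x86_64, where `#` starts an assembly comment. IACA markers are not covered here.

```rust
// Hypothetical helper: emits an assembly comment that llvm-mca recognises
// as a region marker. On other architectures it expands to nothing.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
macro_rules! mca_marker {
    ($text:literal) => {
        unsafe {
            // The template contains only a comment, so no instruction is
            // emitted; the asm! statement itself can still constrain how the
            // optimizer schedules the surrounding code.
            ::std::arch::asm!(concat!("# ", $text), options(nostack, preserves_flags));
        }
    };
}

#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
macro_rules! mca_marker {
    ($text:literal) => {};
}

/// Example kernel to analyse (illustrative only). `#[inline(never)]` keeps
/// the function as a separate symbol so it is easy to find in the output.
#[inline(never)]
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    let mut sum = 0.0f32;
    mca_marker!("LLVM-MCA-BEGIN dot");
    for (x, y) in a.iter().zip(b.iter()) {
        sum += x * y;
    }
    mca_marker!("LLVM-MCA-END dot");
    sum
}
```

One way to use this: emit assembly with `cargo rustc --release -- --emit asm`, then pass the generated `.s` file to `llvm-mca`, which reports on the code between the `LLVM-MCA-BEGIN` and `LLVM-MCA-END` comments. The compiler may still move or transform code around the markers, so it is worth checking that the marked region in the assembly contains the loop you meant to measure.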
References
- IACA homepage
- "What is IACA and how to use it?" on StackOverflow
- TheIronBorn's usage of IACA with `movmsk.c`
- LLVM MCA docs
I made a crate last year that makes using IACA slightly easier: https://github.com/hdevalence/iaca-marker-macros