Integrate LLVM Code Coverage Instrumentation
Is your feature request related to a problem? Please describe.
Add LLVM Code Coverage support to Rusty (for use with tools like llvm-cov), so that instrumentation is added to track program execution.
Another developer @HaithemLamri and I are interested in making this happen. We understand that this feature would be pretty complex to implement and not immediately useful for most users, so we’re mostly looking for higher-level support as to how and where this feature could best be integrated in Rusty.
From preliminary research, there seem to be two options for going about this:
- Implement the coverage spec from scratch, using LLVM primitive datatypes:
- Less future-proof, as the implementation will need to be updated as LLVM updates their code coverage specification (though this historically has not changed very frequently).
- This appears to be immediately possibly with only
Inkwellprimitives, without requiring changes toInkwellorllvm_sys.
- Integrate using LLVM’s
CoverageMappingclasses:- This would really simplify the code coverage data type generation, since it abstracts a lot of the generation.
- It appears that the Rust compiler team tied into these classes for Rust’s integration of LLVM code coverage.
- The primary issue with this is that these classes/modules don’t appear to be implemented in
Inkwellor inllvm_sys(whichInkwellcalls), and would likely have to be integrated into both underlying libraries, before being added to Rusty. This seems like a major challenge.
From an initial assessment, the first option seems to be the quickest and easiest to implement barebones functionality… though the second would likely be better for maintainability in the long run.
Additional Resources
- Clang’s code coverage documentation
- Rust’s integration
- Integrate at AST level
- This PR was eventually rejected, in favor of the below PR.
- Integrate at MIR level
- “MIR” (Mid-level IR) appears to be Rust-specific. Rusty only has the AST (as far as I understand).
- Integrate at AST level
- LLVM Code Coverage Generation Modules (not supported in
Inkwellorllvm-sys)- CoverageMappingGen.cpp
- CoverageMappingWriter.cpp
- InstrProf.h: contains useful name mappings
- InstrProfData.inc: contains useful defines
This looks like a great idea.
the second strategy you suggest by using the LLVM classes might not be out of the question.
Inkwell exposes all the LLVM types using either a LLVMReference trait and the internal-getters feature or a as_mut_ptr method on elements (depending on the version, not sure which version rusty is on, might be the LLVMReference one still, but it might be worth updating)
https://github.com/TheDan64/inkwell/pull/392
Also note that Inkwell and llvm-sys only wrap the llvm C api, if something is not in the C api you will need to get a raw handle on the specific objects and branch into the CPP classes.
I'd be happy to review some pull requests with this. I would suggest smaller PRs with that don't break functionality might be the easiest to review. A bigger PR might take a while.
@ghaith thank you for pointing that out about the as_mut_ptr methods.... I didn't realize that! I'll take a look and see if that makes it possible to access everything we would need.
I like the idea of using the CoverageMappingGen classes if possible, since the LLVM IR that clang generates has some more complexity than what's in the specification. It would likely avoid some headaches down the road.