kani icon indicating copy to clipboard operation
kani copied to clipboard

Add a unified codegen cache

Open AlexanderPortland opened this issue 4 months ago • 4 comments

I’ve noticed that Kani will often repeatedly codegen elements that are similar or exactly the same (sometimes as many as ~500 times) while compiling a crate. The two biggest offenders being codegen-ing rustc_public::Tys into cbmc::Types and turning rustc_public::Spans into cbmc::Locations.

This PR introduces an extensible system for caching portions of Kani’s codegen, using it to speed up both cases mentioned above.

Results

Testing on the standard library shows this cache maintains a >98% cache hit rate, reducing end-to-end compile times by 20%, all while only taking up only ~8MB of memory at peak size (<1% of Kani’s total average use).

Other Ideas

I also tried to cache how we codegen Rvalues and FnAbis, but both failed. The former was far too dependent on the current state of the program, and I think the latter is already cached by the compiler’s query system, so my cache only made it slower when we had to actually query the compiler.

That being said, I’m sure there are more opportunities for caching in our codegen, so I designed the system from the start to try to make it easy to extend the cache with more kinds of elements.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

AlexanderPortland avatar Aug 21 '25 19:08 AlexanderPortland

Generally looks good to me, but could you please got the extra mile and add a regression test that will measure the caching efficacy (producing basically the numbers that you put in the PR description, but perhaps for some different code base - maybe even Kani itself?). Just like you added end-to-end compile-time measurements it would be good to have caching numbers so that we know when some other, seemingly unrelated, change completely breaks caching.

tautschnig avatar Aug 22 '25 07:08 tautschnig

Hey @tautschnig, would this be best as a CI job (similar to the compiler timing one) or a regression test with a certain cache hit rate hard coded to detect any regression past that? Today's my last day so just trying to plan out what's feasible in the time frame--CI jobs can be very slow to test in my experience. [for my own memory he said a CI job would be best if possible haha]

AlexanderPortland avatar Aug 22 '25 16:08 AlexanderPortland

Hey @tautschnig, would this be best as a CI job (similar to the compiler timing one) or a regression test with a certain cache hit rate hard coded to detect any regression past that? Today's my last day so just trying to plan out what's feasible in the time frame--CI jobs can be very slow to test in my experience. [for my own memory he said a CI job would be best if possible haha]

@AlexanderPortland Please let us know whether you think you could still find the time to do this. Thanks!

tautschnig avatar Sep 09 '25 08:09 tautschnig

Hey @tautschnig! Sorry for the delay, classwork is picking up and I just got sick lol. I did the fxhash change and I'll likely have a free day to sit down and do the CI job too as long as you're okay to wait another few weeks (but I can't make any promises). If it's time critical to have this land, feel free to merge as is!

AlexanderPortland avatar Sep 16 '25 22:09 AlexanderPortland