rust-playground
Add caching to prevent recompilation of the same data
Maybe also use the cache information to denote whether the displayed output tab(s) are out of sync with the current code / options.
Hi @shepmaster,
Does the Playground implement any kind of caching today, or does all compilation start from a clean state in the Docker image? I have not looked into the code yet, just searched it briefly and found `SANDBOX_CACHE_TIME_TO_LIVE`. I also see a `SandboxCache` in `server_axum.rs`.
Adding a cache keyed by `(compiler-version, user-code)` would make the first execution fast for users, assuming most of them compile and execute the same few Hello World-style programs.
Going further, it would perhaps be useful to cache the entire target/ folder so that editing the code would be fast?
Basically, I'm imagining a flow like this (sketched in code after the list):
- When asked to compile and execute code, check if we have cached output for this `(compiler-version, user-code)` pair.
- If cached: return the cached output or previous compiler errors.
- If not cached:
  - Find the last `target/` folder for the user (use a session cookie to identify the user).
  - Use this to seed the sandbox.
  - Run the compilation like normal, trusting Rust to recompile what is necessary.
  - Cache the output and the `target/` folder.
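To make the flow concrete, here is a rough, self-contained sketch in Rust. Every type, field, and function name below is made up for illustration; it says nothing about how the playground is actually structured:

```rust
use std::collections::HashMap;
use std::path::PathBuf;

#[derive(Clone, Hash, PartialEq, Eq)]
struct CacheKey {
    compiler_version: String,
    user_code: String,
}

#[derive(Clone)]
struct Output(String); // captured stdout/stderr or compiler errors

struct Playground {
    output_cache: HashMap<CacheKey, Output>,
    // session cookie -> location of the user's last target/ folder
    target_dirs: HashMap<String, PathBuf>,
}

impl Playground {
    fn run(&mut self, key: CacheKey, session: &str) -> Output {
        if let Some(cached) = self.output_cache.get(&key) {
            // Cached: return the stored output or previous compiler errors.
            return cached.clone();
        }
        // Not cached: seed the sandbox from the user's last target/ folder,
        // then compile as normal, trusting Cargo to rebuild only what changed.
        let seed = self.target_dirs.get(session).cloned();
        let (output, target_dir) = compile_in_sandbox(&key.user_code, seed);
        self.target_dirs.insert(session.to_owned(), target_dir);
        self.output_cache.insert(key, output.clone());
        output
    }
}

// Stub standing in for the real sandboxed build; returns canned data.
fn compile_in_sandbox(_code: &str, _seed: Option<PathBuf>) -> (Output, PathBuf) {
    (Output("Hello, world!\n".to_owned()), PathBuf::from("target"))
}

fn main() {
    let mut playground = Playground {
        output_cache: HashMap::new(),
        target_dirs: HashMap::new(),
    };
    let key = CacheKey {
        compiler_version: "1.56.0".to_owned(),
        user_code: "fn main() { println!(\"Hello, world!\"); }".to_owned(),
    };
    let first = playground.run(key.clone(), "session-cookie");
    let second = playground.run(key, "session-cookie"); // served from the cache
    assert_eq!(first.0, second.0);
}
```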
Such a system would of course require potentially huge amounts of storage, which would need to be managed carefully. But it sounds like it would be a way for users to transparently get fast results if they edit code starting from an already cached version, almost as fast as normal local incremental compilation. This of course assumes that it's very fast to copy the `target/` folders around, so they should probably be stored locally.
> any kind of caching today, or does all compilation start from a clean state in the Docker image
Depends what "caching" encompasses. We build all the dependencies and bundle that as part of the Docker image, so we start from a clean state and cache that aspect.
> found `SANDBOX_CACHE_TIME_TO_LIVE`. I also see a `SandboxCache` in `server_axum.rs`.
Those are for metadata extracted from the containers — the names & versions of available crates and the versions of the toolchains & tools.
> Adding a cache keyed by `(compiler-version, user-code)`
And debug / release. And edition. Maybe some other things.
> return the cached output or previous compiler errors.
The specific thing that is cached needs to be carefully planned out. For example, you don't want `fn main() { dbg!(random()); }` to always return 4 or `current_time()` to return yesterday because someone ran it once.
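For instance, a minimal program of that kind, using only the standard library:

```rust
// A program whose output changes on every run; caching its first output
// would silently serve a stale result to every later user.
use std::time::{SystemTime, UNIX_EPOCH};

fn main() {
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before 1970")
        .as_secs();
    println!("seconds since the Unix epoch: {}", secs);
}
```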
If we had something at this level, I'd want it to be in the frontend code and be surfaced to the end user. Something like "this code was previously executed and has been cached, click here to re-run it"
> so they should probably be stored locally.
This will place a burden on the system architecture, as that would move us from a stateless to stateful system. If a request wasn't routed to the exact same backend, the local cache would be missing and the request would be slow again.
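To make the statefulness concrete, a sticky-routing scheme would have to pin each session to one backend, for example by hashing a session cookie. A toy sketch, not how the playground routes requests today:

```rust
// Toy illustration of session-affinity ("sticky") routing: the same cookie
// always maps to the same backend, so that backend's local cache stays warm.
// Purely hypothetical; real deployments do this in the load balancer.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn backend_for(session_cookie: &str, num_backends: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    session_cookie.hash(&mut hasher);
    hasher.finish() % num_backends
}

fn main() {
    // Requests from the same session land on the same backend...
    assert_eq!(backend_for("session-abc", 4), backend_for("session-abc", 4));
    // ...but adding or removing a backend reshuffles the mapping, which is
    // exactly the kind of operational wrinkle statefulness introduces.
    println!("backend: {}", backend_for("session-abc", 4));
}
```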
The idea that I've been mulling around in my head is to add a more interactive / long-running paradigm. My pie-in-the-sky thoughts:
- Frontend establishes a WebSocket connection to the backend.
- Backend starts a Docker container (or six? {stable, beta, nightly, miri, rustfmt, clippy})
- We send a lot of the same structural commands to set the code, start a build, etc.
The `target` directory would be preserved for the duration of the session, so rebuilds would be "free". We'd also gain the ability to have interactive programs, as we could pipe stdin and stdout to some extent.
We'd need to manage clients that hang around too long to prevent having thousands of Docker containers doing nothing.
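To sketch what those structural commands might look like as a message protocol (all names here are speculative; nothing like this exists in the playground yet):

```rust
// Hypothetical message types for the long-running session idea; these
// enums only illustrate the shape such a protocol could take.
#[allow(dead_code)]
enum ClientMessage {
    SetCode(String),              // replace the session's source code
    StartBuild { release: bool }, // kick off a build in the container
    Stdin(Vec<u8>),               // forward input to the running program
}

#[allow(dead_code)]
enum ServerMessage {
    Stdout(Vec<u8>), // stream program output back
    Stderr(Vec<u8>), // including compiler diagnostics
    BuildFinished { success: bool },
}

fn main() {
    // The target/ directory lives as long as the session, so a second
    // StartBuild after a small SetCode edit is an incremental rebuild.
    let _edit = ClientMessage::SetCode("fn main() {}".into());
    let _build = ClientMessage::StartBuild { release: false };
    let _done = ServerMessage::BuildFinished { success: true };
}
```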
> almost as fast as normal local incremental compilation
I think there's a buried assumption here that would be good to prove out first. If you write hello world locally, compile and execute it, then change the code to something different, do you get any performance improvements if you don't blow away the target directory?
I suppose my question comes down to: how effective is incremental compilation? How effective is it for the types of programs people use the playground for?
Hi @shepmaster, thanks for all the replies! Great questions there :)
> I think there's a buried assumption here that would be good to prove out first. If you write hello world locally, compile and execute it, then change the code to something different, do you get any performance improvements if you don't blow away the `target` directory?
That is definitely something to test. I tried doing this with `hyperfine`, running a Hello World program with

```rust
fn main() {
    println!("Hello, world 1234");
}
```
The numbers are there so that I can change the code at random using

```console
% sed -i "s/[0-9]\+/$RANDOM/" src/main.rs
```
With that setup, we can compare `cargo build` after a change to `main.rs` with `cargo build` after `cargo clean`:
```console
% hyperfine --prepare 'sed -i "s/[0-9]\+/$RANDOM/" src/main.rs' 'cargo build' \
            --prepare 'cargo clean' 'cargo build'
Benchmark #1: cargo build
  Time (mean ± σ):     328.9 ms ±   1.6 ms    [User: 260.8 ms, System: 77.8 ms]
  Range (min … max):   325.9 ms … 331.2 ms    10 runs

Benchmark #2: cargo build
  Time (mean ± σ):     706.4 ms ±   6.5 ms    [User: 584.5 ms, System: 145.8 ms]
  Range (min … max):   697.1 ms … 719.2 ms    10 runs

Summary
  'cargo build' ran
    2.15 ± 0.02 times faster than 'cargo build'
```
So yeah, it is faster to do an incremental build on such a simple project.
Doing the same, but this time with `main.rs` being

```rust
fn main() {
    println!("{}", textwrap::fill("Hello, world 1234", 10));
}
```
I see times of about 0.5 seconds for an incremental build and 7.8 seconds for a clean build:
```console
% hyperfine --prepare 'sed -i "s/[0-9]\+/$RANDOM/" src/main.rs' 'cargo build' \
            --prepare 'cargo clean' 'cargo build'
Benchmark #1: cargo build
  Time (mean ± σ):     454.7 ms ±   6.3 ms    [User: 352.7 ms, System: 134.2 ms]
  Range (min … max):   444.6 ms … 466.4 ms    10 runs

Benchmark #2: cargo build
  Time (mean ± σ):      7.884 s ±  0.106 s    [User: 18.082 s, System: 1.419 s]
  Range (min … max):    7.781 s …  8.132 s    10 runs

Summary
  'cargo build' ran
    17.34 ± 0.34 times faster than 'cargo build'
```
> Depends what "caching" encompasses. We build all the dependencies and bundle that as part of the Docker image, so we start from a clean state and cache that aspect.
Are the dependencies here the 100 top crates which are available on the playground? Unfortunately, I could not quite follow the Dockerfile logic.
Some quick testing suggests that incremental builds already happen: testing the code above with Textwrap shows that it compiles in ~1 second, not 8 seconds like on my machine.
> and 7.8 seconds for a clean build:
This isn't quite an accurate comparison as it's going to rebuild `textwrap` and all of its dependencies. You'll likely even see Cargo printing out that it's compiling those dependencies. The playground would not rebuild the crates, only the user-submitted code. If it did rebuild them, every request to the playground would take 10-20 minutes to complete 😉
You may wish to try doing something similar to your `target` directory idea:

1. Add `textwrap` as a dependency
2. Add an empty `lib.rs`
3. `cargo build`
4. Remove `lib.rs`
5. Save the `target` directory
6. `rm -rf target`
7. Restore the `target` directory
8. Update `main.rs`
9. `cargo build`
Steps 6-9 are what you'd want to test in hyperfine.
> Are the dependencies here the 100 top crates which are available on the playground
It's closer to 250+.
I do find your first case interesting. I wouldn't have thought that the overhead of setting up the target directory for a hello world would be that large.
> This isn't quite an accurate comparison as it's going to rebuild `textwrap` and all of its dependencies. You'll likely even see Cargo printing out that it's compiling those dependencies. The playground would not rebuild the crates, only the user-submitted code. If it did rebuild them, every request to the playground would take 10-20 minutes to complete 😉
Yeah, that makes a lot of sense now! Especially given the number of precompiled dependencies present on the playground.
I tried testing the difference in speed between

- compiling `main.rs` using a `target/` folder from an empty `lib.rs`, and
- compiling `main.rs` using a `target/` folder already used for a `main.rs`.
I think that is the comparison you're after, right?
So I created a `target.precompiled` folder from an empty `lib.rs` file (with no `main.rs` file at all). I then put my `main.rs` file back, looking like the one above. With this I can compare the two scenarios:
```console
% hyperfine --prepare 'cargo clean; cp -a target.precompiled target' 'cargo build' \
            --prepare 'sed -i "s/[0-9]\+/$RANDOM/" src/main.rs' 'cargo build'
Benchmark #1: cargo build
  Time (mean ± σ):     476.1 ms ±  13.3 ms    [User: 724.7 ms, System: 160.5 ms]
  Range (min … max):   464.9 ms … 507.9 ms    10 runs

Benchmark #2: cargo build
  Time (mean ± σ):     346.9 ms ±   3.6 ms    [User: 270.1 ms, System: 105.7 ms]
  Range (min … max):   343.1 ms … 352.4 ms    10 runs

Summary
  'cargo build' ran
    1.37 ± 0.04 times faster than 'cargo build'
```
I wondered if the time difference was because the `lib.rs` file has "vanished" and turned into a `main.rs` file, so I repeated the experiment. This time I precompiled the Textwrap dependencies with a `main.rs` file containing just `fn main() {}`. The results were the same.
I believe we are seeing the effect of incremental compilation here. The actual call to `textwrap::fill` triggers code paths which are not triggered by the empty `lib.rs` or `main.rs` files. This makes sense: `textwrap::fill` has a generic type parameter, so the compiler has to do more work when it sees the actual call.
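As a toy illustration of that point (my own example, not textwrap's code): the generic function below generates no machine code until a call site fixes the type parameter, so a precompiled `target/` cannot already contain the instantiation.

```rust
// No code is generated for `shout` itself; monomorphization happens only
// at the call site, once S is known to be &str.
fn shout<S: AsRef<str>>(s: S) -> String {
    s.as_ref().to_uppercase()
}

fn main() {
    // This call forces the compiler to generate shout::<&str>, work that
    // an empty lib.rs or main.rs would never have triggered.
    println!("{}", shout("hello, world"));
}
```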
To verify this, I tried using `textwrap::indent` instead: a simpler function with no generics. In that case, the compilation time based on the `target.precompiled` directory is the same as the "incremental" compilation time:
```console
% hyperfine --prepare 'cargo clean; cp -a target.precompiled target; cp main.rs src/' 'cargo build' \
            --prepare 'sed -i "s/[0-9]\+/$RANDOM/" src/main.rs' 'cargo build'
Benchmark #1: cargo build
  Time (mean ± σ):     279.8 ms ±   3.1 ms    [User: 221.6 ms, System: 76.1 ms]
  Range (min … max):   276.3 ms … 287.7 ms    10 runs

Benchmark #2: cargo build
  Time (mean ± σ):     280.2 ms ±   5.7 ms    [User: 221.8 ms, System: 69.2 ms]
  Range (min … max):   275.4 ms … 292.0 ms    10 runs

Summary
  'cargo build' ran
    1.00 ± 0.02 times faster than 'cargo build'
```
In conclusion, I think the caching strategy you use is nearly perfect!
To make it better, it would be necessary to instantiate the generic type parameters of various functions. Perhaps this could be done by running the unit tests of every dependency? However, it's not clear to me whether this can be done in a nice way so that the compiled files all end up in the same `target/` directory.
Oh, one more thing. I checked whether there was a time difference between compiling `fn main() {}` and

```rust
fn main() {
    println!("Hello world");
}
```
and there isn't. I believe this means that the generic types used by the printing machinery are already instantiated by the empty program, perhaps by the code which prints stack traces.
I think it could still be interesting to include a few "typical" constructs in the precompiled `target/` folder for the playground. As an example, I compared compiling the empty `main.rs` against

```rust
fn main() {
    let mut m = std::collections::HashMap::new();
    m.insert("Hello, world", 1234);
    println!("m: {:?}", m);
}
```
There I can measure a difference of about 20%: it takes 370 ms to compile the code starting from empty, but only 300 ms when done incrementally. I saw a smaller improvement when letting `main.rs` use a `Vec<i32>`: there the time went from 300 ms to 270 ms.
So perhaps the empty `lib.rs` file should be replaced with a simple file which uses a few common stdlib types.
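As a minimal sketch, assuming the goal is just to touch a few common stdlib generics, such a seed file might look like this (the exact set of types is a guess at what typical playground programs use):

```rust
// Hypothetical replacement for the empty lib.rs: exercise a few common
// stdlib generics so their instantiations land in the cached target/.
use std::collections::HashMap;

pub fn warm_up() {
    let mut map: HashMap<&str, i32> = HashMap::new();
    map.insert("Hello, world", 1234);

    let numbers: Vec<i32> = (0..10).collect();

    println!("map: {:?}, numbers: {:?}", map, numbers);
}
```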