Record and Replay Support for Component Model
Brief
This PR adds intrinsic support for deterministic record and replay (RR) of Wasm components in Wasmtime. It received an initial round of discussion in the Wasmtime bi-weekly meeting on 07/17.
Motivation
RR is a very useful primitive for improving the debugging story of Wasm in Wasmtime. Bugs that are often encountered in modules during deployment can be deterministically reproduced. In particular, it provides the foundation for the following (to name a few):
- Reverse-execution (a.k.a. time-travel) debugging
- Offline static/dynamic analysis of prior module executions
- Profiling of module/runtime components
- Automatic extraction of differential unit-tests for system interfaces (e.g. WASI)
- Interposition points for targeted fuzzing of system interfaces and/or modules
Scope
This initial PR provides the base primitives for recording and replaying events. It supports RR at all import function boundaries and for the lowering rules of component types. The RR event infrastructure is intended to be easily extensible to new event types as new use-cases emerge.
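To make the import-boundary idea concrete, here is a minimal sketch of how a single import call could be recorded and later replayed. The names `ImportCallEvent`, `Mode`, and `call_import` are hypothetical and are not the types introduced by this PR; results are simplified to `u64` values rather than lowered component values.

```rust
use std::collections::VecDeque;

/// Hypothetical event: one completed host import call and its results.
#[derive(Debug, Clone, PartialEq)]
pub struct ImportCallEvent {
    pub import_name: String,
    pub results: Vec<u64>,
}

/// Hypothetical execution mode (illustration only, not the PR's API).
pub enum Mode {
    /// Run the real host function and append an event to the trace.
    Record(Vec<ImportCallEvent>),
    /// Never run the host function; serve results from the trace.
    Replay(VecDeque<ImportCallEvent>),
}

impl Mode {
    /// Invoke an import through the RR layer.
    pub fn call_import<F>(&mut self, name: &str, host: F) -> Result<Vec<u64>, String>
    where
        F: FnOnce() -> Vec<u64>,
    {
        match self {
            Mode::Record(trace) => {
                let results = host();
                trace.push(ImportCallEvent {
                    import_name: name.to_string(),
                    results: results.clone(),
                });
                Ok(results)
            }
            Mode::Replay(trace) => {
                let event = trace
                    .pop_front()
                    .ok_or_else(|| format!("trace exhausted at call to `{}`", name))?;
                // A mismatch means the guest diverged from the recorded run.
                if event.import_name != name {
                    return Err(format!(
                        "replay divergence: trace has `{}`, guest called `{}`",
                        event.import_name, name
                    ));
                }
                Ok(event.results)
            }
        }
    }
}
```

In replay mode the host closure is never invoked, which is what would allow a trace to be replayed outside the original embedder context.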
Primary Goals
- Low-overhead recording (memory, compute, and trace size) and high-performance replay.
- Full determinism during replay that can run outside the embedder context.
- An engine-agnostic trace recording format -- the goal is to purely capture guest-host boundary crossings (import calls and component model interactions) that can be reasonably interpreted by another component model compliant engine.
Non-Goals (Subject to Discussion)
- A human-readable trace format. This belongs better in something like wit-bindgen, and/or as an independent tool over the low-level trace.
- Replay support for updated versions of a recorded module -- this requires a much more coordinated effort from the producers as well to make this practically useful.
Initial Performance Numbers
Some initial runs on compression libraries like zstd show a 4-5% overhead from the recording logic, excluding disk I/O. This seems reasonable at the moment and likely doesn't need further optimization unless there are explicit use-cases.
Minor Todo
The following (minor) additions will be made in the coming days prior to potential merging:
- Encoding hashes of modules in the recorded trace for validation
- Generic writers/readers for RecordBuffer and ReplayBuffer
- Feature gating all of RR
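As a rough illustration of the first two items, the sketch below writes a trace header containing a digest of the module bytes through any `std::io::Write`, and validates it through any `std::io::Read` before replay. Everything here is an assumption made for illustration: the `RRT0` magic, the header layout, and the function names are hypothetical, and FNV-1a is only a std-only stand-in for whatever cryptographic hash the real implementation would use.

```rust
use std::io::{self, Read, Write};

/// Hypothetical 4-byte magic identifying a trace file.
const MAGIC: &[u8; 4] = b"RRT0";

/// FNV-1a (64-bit); a std-only stand-in for a cryptographic digest.
pub fn module_digest(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Write the header to any sink (file, socket, in-memory buffer).
pub fn write_header<W: Write>(mut out: W, module: &[u8]) -> io::Result<()> {
    out.write_all(MAGIC)?;
    out.write_all(&module_digest(module).to_le_bytes())
}

/// Read the header from any source and check it matches `module`.
pub fn check_header<R: Read>(mut input: R, module: &[u8]) -> io::Result<()> {
    let mut magic = [0u8; 4];
    input.read_exact(&mut magic)?;
    let mut digest = [0u8; 8];
    input.read_exact(&mut digest)?;
    if &magic != MAGIC {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "not a trace file"));
    }
    if u64::from_le_bytes(digest) != module_digest(module) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "trace was recorded from a different module",
        ));
    }
    Ok(())
}
```

Being generic over `Write` and `Read` means the same code path can target a `Vec<u8>` in tests and a buffered file in production.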
Questions for Maintainers
- Do we get wasip1 for free by recording/replaying at this level?
- What's typically the most idiomatic way to serialize anyhow::Error values?
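One possible answer to the serialization question, offered as a sketch rather than a recommendation from the maintainers: flatten the error's `source()` chain into plain strings at record time, and rebuild a stand-in error with the same messages at replay time. The code below uses only `std::error::Error`; with anyhow, the same list of messages could presumably be collected from `anyhow::Error::chain`. `ReplayedError` and `flatten_chain` are hypothetical names, and this approach deliberately loses the concrete error types and any backtrace.

```rust
use std::error::Error;
use std::fmt;

/// Flatten an error and its `source()` chain into strings, outermost first.
pub fn flatten_chain(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut messages = Vec::new();
    let mut current: Option<&(dyn Error + 'static)> = Some(err);
    while let Some(e) = current {
        messages.push(e.to_string());
        current = e.source();
    }
    messages
}

/// Hypothetical replay-side error rebuilt from recorded messages.
#[derive(Debug)]
pub struct ReplayedError {
    message: String,
    source: Option<Box<ReplayedError>>,
}

impl ReplayedError {
    /// Rebuild a chain from outermost-first messages; `None` if empty.
    pub fn from_messages(messages: &[String]) -> Option<ReplayedError> {
        let mut rebuilt: Option<ReplayedError> = None;
        // Walk innermost-to-outermost so each error wraps the previous one.
        for message in messages.iter().rev() {
            rebuilt = Some(ReplayedError {
                message: message.clone(),
                source: rebuilt.map(Box::new),
            });
        }
        rebuilt
    }
}

impl fmt::Display for ReplayedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.message)
    }
}

impl Error for ReplayedError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref().map(|e| e as &(dyn Error + 'static))
    }
}
```

The trade-off is that replayed code can no longer downcast to the original error type, so this only fits if the guest-visible behavior depends on the messages or on a separately recorded error code.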
Label Messager: wasmtime:config
It looks like you are changing Wasmtime's configuration options. Make sure to complete this check list:
- [ ] If you added a new Config method, you wrote extensive documentation for it.

  Our documentation should be of the following form:

      Short, simple summary sentence.

      More details. These details can be multiple paragraphs. There should be
      information about not just the method, but its parameters and results as
      well.

      Is this method fallible? If so, when can it return an error?

      Can this method panic? If so, when does it panic?

      # Example

      Optional example here.

- [ ] If you added a new Config method, or modified an existing one, you ensured that this configuration is exercised by the fuzz targets.

  For example, if you expose a new strategy for allocating the next instance slot inside the pooling allocator, you should ensure that at least one of our fuzz targets exercises that new strategy.

  Often, all that is required of you is to ensure that there is a knob for this configuration option in wasmtime_fuzzing::Config (or one of its nested structs).

  Rarely, this may require authoring a new fuzz target to specifically test this configuration. See our docs on fuzzing for more details.

- [ ] If you are enabling a configuration option by default, make sure that it has been fuzzed for at least two weeks before turning it on by default.
To modify this label's message, edit the .github/label-messager/wasmtime-config.md file.
To add new label messages or remove existing label messages, edit the .github/label-messager.json configuration file.
We'll want at least @alexcrichton to give this a review as well, maybe after we try some of the refactors mentioned here.
Happy to help out! (and agreed, I'd like to give this a once-over at some point)
How about scheduling a call with the 3 of us when y'all are ready? That'd probably be the best way to draw attention to specific areas and for me to ask some questions in a high-bandwidth way before going off to review on my own.
That'd be great! FWIW, I'm on PTO next week and the week after; please feel free to talk with Arjun directly before then if you both want, or I can join after Aug 11...
@alexcrichton @fitzgen I'll take a pass through the comments early next week, and perhaps we can find a time later next week that works.
Sounds good! Feel free to ping me on Zulip when ready.