fizzy
Parser and validation design
Levels of parsing / decoding / validation
- "Unsafe" parser. It assumes the wasm module is valid and happily reads it without checking for out-of-buffer access. Providing an invalid module can crash the parser.
- "???" parser. It assumes the wasm module is valid but has additional checks preventing invalid memory access. Providing an invalid module will produce invalid but deterministic execution results. The parser never crashes. Hopefully, this one is not much slower than the "unsafe" one.
- Validator. It parses the wasm module and also fully validates it.
What we currently have is a mixture of levels 0 ("unsafe" parser) and 2 (validator), counting the list above from 0, with only partial validation.
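To make the difference between the first two levels concrete, here is a minimal sketch of an unchecked byte read next to a bounds-checked one, plus a bounds-checked LEB128 `u32` decoder built on top of it. All names (`readByteUnsafe`, `readByteChecked`, `readU32Checked`) are hypothetical and not fizzy's actual API. The checked decoder only guarantees memory safety and determinism; it deliberately does not reject over-long encodings, which would be the validator's job.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helpers for illustration; not fizzy's actual API.

// Level 0 ("unsafe"): trusts the input. Reading at or past the end of the
// buffer is undefined behaviour.
inline uint8_t readByteUnsafe(const uint8_t*& pos)
{
    return *pos++;
}

// Level 1 (bounds-checked): never reads outside [pos, end).
// Returns std::nullopt when the input is exhausted.
inline std::optional<uint8_t> readByteChecked(const uint8_t*& pos, const uint8_t* end)
{
    assert(pos <= end);  // precondition on the caller, not on the input bytes
    if (pos == end)
        return std::nullopt;
    return *pos++;
}

// Bounds-checked unsigned LEB128 decoding of a u32.
// Memory-safe and deterministic, but not validating: excess bits of an
// over-long encoding are silently ignored instead of being reported.
inline std::optional<uint32_t> readU32Checked(const uint8_t*& pos, const uint8_t* end)
{
    uint32_t result = 0;
    for (unsigned shift = 0;; shift += 7)
    {
        const auto byte = readByteChecked(pos, end);
        if (!byte)
            return std::nullopt;  // truncated input
        if (shift < 32)           // avoid an out-of-range shift
            result |= static_cast<uint32_t>(*byte & 0x7f) << shift;
        if ((*byte & 0x80) == 0)
            return result;
    }
}
```

The only extra cost over the unchecked version is one pointer comparison per byte, which is why the bounds-checked level may not be much slower in practice; that would need to be measured.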
Design
- The wasm spec allows lazy validation of functions (executing an invalid function traps). So a good option is to implement the wasm parser this way: functions are validated before being called for the first time.
- The wasm spec has a "Validation Algorithm" appendix: https://webassembly.github.io/spec/core/appendix/algorithm.html (Thanks to Paul for the reference).
- The goal is to produce the internal module representation needed for execution. Information that is not needed should be discarded during parsing.
- Validation can be a template param for Parser type (separating code for reader/parser and validator seems impractical).
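```cpp
#include <cstdint>
#include <stdexcept>

enum class ValType : uint8_t
{
    i32 = 0x7f,
    i64 = 0x7e,
};

template <bool Validate>
struct Parser
{
    ValType parseValType(uint8_t b)
    {
        // This check is compiled in only for Parser<true>.
        if constexpr (Validate)
        {
            if (b != 0x7f && b != 0x7e)
                throw std::runtime_error{"invalid type"};
        }
        // Without validation, any byte other than 0x7e is treated as i32.
        return (b == 0x7e) ? ValType::i64 : ValType::i32;
    }
};
```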
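The lazy validation idea from the design list could look roughly like the sketch below: each function carries a validation state, and the first call triggers validation, trapping if it fails. The types `Function`, `ValidationState`, and `CallResult` are hypothetical, and `validateBody` is a deliberately trivial placeholder (it only checks that the body ends with the `end` opcode, `0x0b`), standing in for a real validator.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; not fizzy's actual data structures.
enum class ValidationState : uint8_t { NotChecked, Valid, Invalid };

struct Function
{
    std::vector<uint8_t> code;  // raw function body bytes
    ValidationState state = ValidationState::NotChecked;
};

// Placeholder validator (assumption): accepts a body only if it is non-empty
// and ends with the `end` opcode (0x0b). A real one would type-check the body.
inline bool validateBody(const std::vector<uint8_t>& code)
{
    return !code.empty() && code.back() == 0x0b;
}

enum class CallResult { Ok, Trap };

// Validates the function on its first call and caches the outcome,
// so later calls pay only for one comparison.
inline CallResult call(Function& f)
{
    if (f.state == ValidationState::NotChecked)
    {
        f.state = validateBody(f.code) ? ValidationState::Valid :
                                         ValidationState::Invalid;
    }
    assert(f.state != ValidationState::NotChecked);

    if (f.state == ValidationState::Invalid)
        return CallResult::Trap;  // executing an invalid function traps

    // Execution of the function body would happen here.
    return CallResult::Ok;
}
```

One consequence of this design is that a module with an invalid function that is never called runs without error, so it fits the bounds-checked parser level better than the full validator level.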
The algorithm seems to validate the code sections, but nothing else really.
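For context, the core of that algorithm is type-checking instruction sequences against an operand stack of value types. Below is a heavily simplified sketch of that idea, assuming a made-up four-instruction set (`Instr`) with no control flow, locals, or immediates; it is not the spec's algorithm in full and not fizzy's code.

```cpp
#include <cstdint>
#include <vector>

enum class ValType : uint8_t { i32 = 0x7f, i64 = 0x7e };

// Simplified, hypothetical instruction set for illustration only.
enum class Instr { I32Const, I64Const, I32Add, Drop };

// Returns true if `body` type-checks and leaves exactly `results`
// on the operand stack.
inline bool validateSequence(
    const std::vector<Instr>& body, const std::vector<ValType>& results)
{
    std::vector<ValType> stack;

    // Pops one operand, failing on underflow or type mismatch.
    const auto pop = [&stack](ValType expected) {
        if (stack.empty() || stack.back() != expected)
            return false;
        stack.pop_back();
        return true;
    };

    for (const auto instr : body)
    {
        switch (instr)
        {
        case Instr::I32Const:
            stack.push_back(ValType::i32);
            break;
        case Instr::I64Const:
            stack.push_back(ValType::i64);
            break;
        case Instr::I32Add:  // [i32 i32] -> [i32]
            if (!pop(ValType::i32) || !pop(ValType::i32))
                return false;
            stack.push_back(ValType::i32);
            break;
        case Instr::Drop:  // [t] -> []
            if (stack.empty())
                return false;
            stack.pop_back();
            break;
        }
    }
    return stack == results;
}
```

Everything outside function bodies (section ordering, index bounds, import and export checks, and so on) would still need separate validation code.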
@chfast @gumb0 can we close this?
Some parts are not implemented.