wasm-tools

Initial support for value section parsing

Open • primoly opened this issue 9 months ago • 4 comments

This draft adds support for component value sections to wasmparser. https://github.com/WebAssembly/component-model/pull/336

Questions, issues and remarks:

The validator currently does not support validation of the actual values:

The parsing of values depends on the component type space, so a value section can’t be parsed in isolation (https://github.com/WebAssembly/component-model/issues/352). How should this be handled? Currently, ComponentValue only parses the type and keeps the raw bytes of the val. To read the actual value, the val method takes a slice of ComponentTypes, which must be in the order they were read from the current component’s ComponentTypeSectionReader.
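To make the deferred-parsing idea concrete, here is a minimal sketch. All names (`ValType`, `Val`, `RawValue`) are hypothetical, simplified stand-ins and not the actual wasmparser types, and the one-byte payload encodings are assumptions chosen only for illustration. The value keeps its type and raw bytes, and decoding happens only once the caller supplies the type space.

```rust
/// Hypothetical, simplified stand-ins; not the actual wasmparser types.
#[derive(Debug, Clone, PartialEq)]
enum ValType {
    Bool,
    U8,
    /// Index into the enclosing component's type space.
    Defined(u32),
}

#[derive(Debug, Clone, PartialEq)]
enum Val {
    Bool(bool),
    U8(u8),
}

/// A value whose type is known but whose payload is still raw bytes.
struct RawValue<'a> {
    ty: ValType,
    bytes: &'a [u8],
}

impl<'a> RawValue<'a> {
    /// Decode the payload, resolving type indices against `types`,
    /// which must be in the order they were read from the component.
    fn val(&self, types: &[ValType]) -> Result<Val, String> {
        // Follow type indices until a primitive type is reached.
        let mut ty = &self.ty;
        let mut hops = 0;
        while let ValType::Defined(idx) = ty {
            ty = types
                .get(*idx as usize)
                .ok_or_else(|| format!("unknown type index {idx}"))?;
            hops += 1;
            if hops > types.len() {
                return Err("cyclic type reference".to_string());
            }
        }
        // Assumed encodings: a bool is one byte (0 or 1), a u8 is one byte.
        match (ty, self.bytes) {
            (ValType::Bool, [0]) => Ok(Val::Bool(false)),
            (ValType::Bool, [1]) => Ok(Val::Bool(true)),
            (ValType::U8, [b]) => Ok(Val::U8(*b)),
            _ => Err("payload does not match type".to_string()),
        }
    }
}
```

With this shape, a caller such as a printer would collect the types as it reads the type section and pass them to `val` when it reaches the value section. An unknown index or a mismatched payload surfaces as an error at that point rather than at section-read time.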

wasmparser duplicates the types in validator and reader. Currently values.rs uses the ones from reader. Should it use types::Types from validator instead?

Validation only needs to call val, as the parser then checks if the types of values are correct.

Crates depending on wasmparser (such as wasmprinter) need to keep track of the component type space. wasmprinter already keeps some global module/component information in the state variable. wasmparser itself does that in its validator.

Parsing of flags that set labels past 64 to true currently fails. This needs support for arbitrarily large unsigned LEB128 integers.
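One possible way around the 64-bit limit is to never materialize the integer at all and instead stream the LEB128 payload bits straight into a `Vec<bool>`. The sketch below is illustrative only: `read_leb128_bits` is a hypothetical helper, not an existing wasmparser function, and it assumes that bit i of the encoded integer corresponds to label i.

```rust
/// Decode an unsigned LEB128 integer of arbitrary size into its bits,
/// least significant first. Returns the bits (padded to `max_bits`)
/// and the number of bytes consumed.
/// Assumption: bit `i` of the integer corresponds to flag label `i`.
fn read_leb128_bits(bytes: &[u8], max_bits: usize) -> Result<(Vec<bool>, usize), String> {
    let mut bits = Vec::new();
    for (i, &byte) in bytes.iter().enumerate() {
        // The low 7 bits of each byte carry payload, least significant group first.
        for shift in 0..7 {
            let bit = ((byte >> shift) & 1) == 1;
            if bits.len() < max_bits {
                bits.push(bit);
            } else if bit {
                return Err(format!("bit set beyond the {max_bits} declared labels"));
            }
        }
        // A clear high bit marks the final byte.
        if (byte & 0x80) == 0 {
            bits.resize(max_bits, false);
            return Ok((bits, i + 1));
        }
    }
    Err("unexpected end of input in LEB128".to_string())
}
```

For example, a flags value with only label 70 set would be encoded as ten `0x80` bytes followed by `0x01`, and this helper would return 71 bits with only the last one true. Whether overlong encodings or set bits beyond the declared label count should be rejected is a validation decision; this sketch rejects only the latter.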

How should flags be stored in wasmparser? Explainer.md describes a flags value as a list of the labels of the fields set to true. In wasmparser, I think storing that list as u32s would likely confuse people into thinking the u32s are just bools encoded as integers. So I instead chose Vec<bool>, which treats flags as a special record with only bool fields (which it already is). This is more symmetrical and ergonomic.

Should Record, Flags, Variant and Enum store the label names? They are already provided in the ComponentType. Flags could then also be a Vec<&str>.
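For comparison, here is a small sketch of how the three candidate representations of flags relate to each other: a list of set label indices, a `Vec<bool>`, and a `Vec<&str>` of set label names. The helper names are hypothetical and the label names are assumed to come from the flags type definition.

```rust
/// Expand a list of set label indices into one bool per label.
/// Returns `None` if an index names no label.
fn indices_to_bools(set: &[u32], label_count: usize) -> Option<Vec<bool>> {
    let mut bools = vec![false; label_count];
    for &idx in set {
        *bools.get_mut(idx as usize)? = true;
    }
    Some(bools)
}

/// Pair each bool with its label name and keep the names that are set.
/// `labels` is assumed to come from the flags type definition.
fn set_label_names<'a>(bools: &[bool], labels: &[&'a str]) -> Vec<&'a str> {
    labels
        .iter()
        .zip(bools)
        .filter(|(_, set)| **set)
        .map(|(name, _)| *name)
        .collect()
}
```

Given labels `["read", "write", "exec"]` and set indices `[0, 2]`, the first helper yields `[true, false, true]` and the second yields `["read", "exec"]`. The `Vec<bool>` form needs only the label count, while the `Vec<&str>` form needs the type definition at hand, which ties back to the type-space question above.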

BinaryReaderErrors use reader.original_position() as their offset. This might not always be correct (it can be off by one or a couple of bytes).

primoly, May 09 '24 13:05