design icon indicating copy to clipboard operation
design copied to clipboard

Should const expressions declare their length in the binary format?

Open sbc100 opened this issue 3 years ago • 13 comments

Const expressions are like functions in that they are defined as sequences of instructions. However, unlike functions, they do not declare their own length but are instead defined to end only if/when an End opcode is found. The downside of this is that parsers cannot skip over them and must decode the instruction stream. For some tools (such as llvm) is can be useful to be able to parse wasm files without doing a full instruction decode (or indeed without the ability to decode instructions at all).

I noticed this issue while working on extended const expressions in various tools such as llvm and wabt. In llvm. for example, we use the wasm format also as the object file format and the object reading doesn't want to care about instruction decoding (for now I've had to teach it enough to handle just the constant expressions instructions).

I understand that changing this could be tricky/impossible/costly, but I thought it worth discussion.

sbc100 avatar Mar 10 '22 21:03 sbc100

Over in https://github.com/WebAssembly/feature-detection/issues/6#issuecomment-1064843246 the prospect is also raised that const expressions may need to participate in conditional-feature expansion; there's possibly a tie-in with the present issue, wherein const expressions become richer / have some kind of optional prefix.

lars-t-hansen avatar Mar 11 '22 07:03 lars-t-hansen

Changing this for the existing format may be too much, but it's certainly not too late for extended-const, and I think it's worth talking about. On the LLVM issue I wondered whether the cost of an extra byte per expression would be worthwhile for the benefit, and wondered what the use case is for knowing the size of each individual constexpr (as opposed to the size of the whole section, which you can already skip over). LLVM's wasm object file decoder is an example, where it keeps a pointer to the body of each init expr, but does not attempt to validate or decode the instructions.

dschuff avatar Mar 14 '22 15:03 dschuff

I believe we even discussed this in the early days. IIRC, the argument was that for functions, this is meant to support parallel decoding/compilation, and that isn't really useful for constants: (1) they tend to be too small and (2) in many cases their value needs to be known anyway to process the surrounding construct.

The GC proposal may affect this reasoning to some degree for global and elem initialisers, but to a degree making a change worthwhile?

rossberg avatar Mar 15 '22 10:03 rossberg

It seems possible to do this in a backwards compatible pay-as-you-go way: In the current binary format all 3 places where init expressions are use are currently preceded by a flag set. Globals have flags which current just encode IS_MUTABLE, and data and element segments have flags that currently encode IS_PASSIVE. These could both get an EXTENDED_INIT flag to signal the use of extended cont in their initializer (which could have length prefix). A single binary could then avoid the length prefix for trivial initializers... thus avoiding the cost for those not wanting to use fancier init expressions.

Is it worth it to be able to skip over instruction sequences rather than decode them? I think it could be one day, especially if we keep extending the set of valid sequences.

sbc100 avatar Mar 15 '22 15:03 sbc100

@rossberg What about a future where we allow initializing whole (acyclic) graphs of objects in const initializers? This would be very useful for, e.g. Virgil, where an initial heap is baked into the binary. What Virgil does when targeting the JVM is generate bytecode that allocates the heap (with an unfortunate limit of the JVM method size, 64KB of bytecode). It'd do basically the same for Wasm GC, except that allowing that code to be in global initializers would allow the engine to cache/copy the graph (with memcpy or similar), rather than interpreting the instructions. Similarly, static compilers for Wasm like wasmtime could generate an initialized data segment with pickled objects.

@sbc100 At first I didn't like the idea of another flag changing the encoding of the following bytes, but we've done this enough times (and with the above reasoning for initialized graphs) that I think it might make sense.

titzer avatar Mar 15 '22 21:03 titzer

@titzer, that's an interesting idea, but wouldn't it be much too limited in practice without the ability to represent cyclic object graphs? Those are omnipresent. And it's not clear how we can ever support those in initialisers.

So you'd have to defer tying recursive knots to some startup code and ideally, somehow take the snapshot afterwards (e.g., after executing the start functions). But if that's the case anyway, complex initialiser expressions perhaps don't buy you much?

rossberg avatar Mar 17 '22 07:03 rossberg

Sure, you can't denote cyclic data structures purely in the initializer and have to fix them, e.g. in the start function. But it's a worthwhile tradeoff because those back links will generally be few relative to the thousands or even millions of objects that might be in the initial heap. An engine can still mmap those MAP_PRIVATE and patching them will only cost a dirty page, or this will happen automatically if a static compiler emits them as a regular data section.

titzer avatar Mar 17 '22 07:03 titzer

FWIW, I am skeptical about the usefulness of memcpy or mmap for this, since the target isn't an empty heap, i.e., you have to relocate all pointers, which may easily be half the words in the image or more.

rossberg avatar Mar 17 '22 08:03 rossberg

Relocation is still cheaper than allocating the objects one-by-one and then writing their fields one-by-one, even if the majority of the heap were reference fields, which is of course very application-dependent. You don't necessarily need relocation for a static compiler; you would locate the initialized data at a fixed address. E.g. Virgil native binaries have static addresses for all their sections and need no relocation.

titzer avatar Mar 17 '22 08:03 titzer

You don't necessarily need relocation for a static compiler; you would locate the initialized data at a fixed address.

Not sure I follow. AFAICS, when compiling a Wasm module, this entire optimisation only is beneficial when the module is expected to be instantiated multiple times. And then the target addresses can't possibly be fixed?

Maybe I'm misunderstanding what you mean be static compiler. Are you talking about a scenario where Wasm is compiled to native code off-line and then cached between entire runs?

rossberg avatar Mar 17 '22 08:03 rossberg

Are you talking about a scenario where Wasm is compiled to native code off-line and then cached between entire runs?

Yes.

titzer avatar Mar 17 '22 13:03 titzer

Another binary format option that came up during the hybrid CG meeting was to not use a flag bit, but to use a (perhaps illegal) instruction. For example, the unreachable (== 0) instruction as the first byte in an initializer could signal a length is to come next. Coming back to this thread though, I see that @sbc100 mentioned that a flags field is in every location preceding an initializer. If that's the case, I think I would prefer a bit.

titzer avatar Oct 31 '22 15:10 titzer

Another conclusion from the recent meeting was that we should include in the extended const proposal so avoid having variable length const expressions without this flag/bit.

I've re-opened https://github.com/WebAssembly/extended-const/issues/9 to address this.

sbc100 avatar Oct 31 '22 19:10 sbc100