Proposal: Add type for unreachable states
Problem
The current design of the unreachable instruction (and br and the like) imposes an additional decidability constraint on WebAssembly: for any instruction sequence, we need to be able to determine if there exists a typing for the sequence.
This problem will grow more difficult as WebAssembly grows (until we make it trivial with this proposal). Even just the small complexity introduced by adding reference types has caused a bug in the reference interpreter (WebAssembly/reference-types#111), imposing yet another layer of complexity on the reference interpreter solely for type-checking unreachable code. This will be made worse by br_on_null in the typed references proposal, which will require adding an "unknown reference type" case.
This decidability constraint seems to already be restricting the design of WebAssembly. For example, here in #1365 it is raised as an issue for adding a dup instruction. This is because (unannotated) dup and pick instructions would require you to reason that the duplicated type is used consistently across multiple instructions, requiring the type-checker to track constraints on type variables and ensure they're satisfiable, all for the sake of just type-checking unreachable code.
Short-Term Solution
A short-term solution is to add an unreachable type (it seems specifically to be a result type) to the typing rules but not to the grammar. Any (syntactically valid) instruction sequence would have type unreachable -> unreachable. Because this changes nothing besides validation and accepts strictly more programs, we could make this change to any proposal spec and it would be backwards compatible.
This solution does not seem to have been discussed in the design rationale. It also seems to be compatible with all the desirable properties provided there except for the following:
It is agnostic to reachability: neither the specification nor producers need to define, check, or otherwise care about reachability (unless they choose to, e.g. to perform dead code elimination).
However, this property seems to already not hold even for the original WebAssembly. In particular, select requires type-directed compilation, and without the relevant type information there is no guidance as to how it should be compiled. So WebAssembly compilers need to reason about unreachability already, likely by simply not compiling unreachable code (which addresses the attack vector mentioned here).
Long-Term Solution
A long-term solution would extend the grammar to account for an unreachable type. For example, a function could declare its type as [i32 i32] -> unreachable. WebAssembly compilers could even use this information to optimize calls to such functions, i.e. no need to save registers before the call if the call will never return! (Though of course you still have to consider exceptions.) This solution would involve many more design decisions, some of which might be informed by other discussions that are in flight, so it would probably be best to explore this option as part of some larger effort rather than as an isolated proposal.
This exact type was previously introduced as part of the reference types proposal, for the reasons you mention, and some others (it wasn't needed for the MVP, because the type algebra was simple enough). However, as a bottom type, it relies on subtyping, so it got removed in response to the request to remove subtyping. I explicitly called out that this also axes the bottom type. I'm not surprised that we overlooked an implication like this.
Subtyping and the bottom type are currently reintroduced in the func-ref proposal. But it looks like we have to move it back to reftypes. :(
I agree that at least the short-term solution is desirable (I sketched out the formal rules here).
FWIW, the design rationale you linked does explicitly address this:
Other solutions that have been discussed would lose most of the above properties. For example: [...snip...] Allowing but not type-checking unreachable code would break decomposability and requires the spec to provide a syntactic definition of reachability (which is unlikely to match the definition otherwise used by producers).
I think the concerns about the "syntactic definition of reachability" are overstated. To implement a type checker today, exactly the same set of instructions needs to be handled specially because of their polymorphic output types. I'm also not sure if anyone is relying on decomposability (I could believe reliance on the converse).
The comparison to the JVM at the end of that rationale doesn't appear to be accurate. The analogy in Wasm for the JVM's approach would be to completely forbid syntactically dead code. (Quote from the latest JVM standard (link))
It is illegal to have code after an unconditional branch without a stack map frame being provided for it.
So the JVM does not have to deal with any polymorphism issues in dead code. In an earlier version of the JVM (I think prior to the introduction of stack maps), return explicitly set the output type to empty (link).
FWIW, another "design smell" surrounding this - when I implemented a verified type checker for Wasm, easily 90% of the effort both in implementation and in correctness proof was concerned purely with handling the dead code case. This isn't precise but it's a rough indicator for the mental overhead this design decision is imposing on type checking, even before typed objects are introduced.
EDIT: missed Andreas' response
This exact type was previously introduced as part of the reference types proposal, for the reasons you mention, and some others (it wasn't needed for the MVP, because the type algebra was simple enough). However, as a bottom type, it relies on subtyping, so it got removed in response to the request to remove subtyping.
I don't think this type should rely on subtyping. It's the type of the whole stack, not the type of an individual element. To allow dead code to skip validation, all you need are the typing rules
C ⊢ unreachable : t* -> ⊥
<similar for return, br, br_table>
C ⊢ e* : ⊥ -> t*
This should work even if t* can contain object types.
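To make these two rules concrete, here is a minimal sketch of a checker over a three-opcode toy subset. The opcode names are real Wasm, but the algorithm and the representation of ⊥ are illustrative assumptions, not the spec's validation algorithm:

```python
# Toy illustration of the proposed rules: once an instruction like
# `unreachable` is seen, the *whole* type stack becomes BOTTOM (⊥),
# and everything after it type-checks against any result type.

BOTTOM = "bottom"  # stands for the ⊥ result type (the entire type stack)

def check_block(instrs, results):
    """Validate a straight-line block against its declared result types."""
    stack = []
    for op in instrs:
        if stack == BOTTOM:
            # C ⊢ e* : ⊥ -> t* : everything after ⊥ is accepted.
            return True
        if op == "i32.const":
            stack.append("i32")
        elif op == "i32.add":
            if stack[-2:] != ["i32", "i32"]:
                return False
            stack = stack[:-2] + ["i32"]
        elif op == "unreachable":
            # C ⊢ unreachable : t* -> ⊥ : the whole stack becomes ⊥.
            stack = BOTTOM
        else:
            return False  # opcode outside this toy subset
    # A ⊥ stack matches any declared result type.
    return stack == BOTTOM or stack == results

print(check_block(["unreachable", "i32.add"], ["f64"]))       # True
print(check_block(["i32.const", "i32.add"], ["i32"]))         # False
```

Note that ⊥ here is a property of the whole stack, not of an individual stack slot, which is why no subtyping machinery is needed.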
My impression of the discussion about subtyping for reference types was that it was centered around a (near-)bottom heap type. I only just learned that there was a bottom value type in the proposal as well. As @conrad-watt points out, the suggestion here is a bottom state/result type, which avoids many of the problems of bottom heap/value types while providing a simpler solution to the same problem that the bottom value type seems to have been solving. (It's also simple enough that its rules can be formulated without explicitly adding subtyping.)
FWIW, the design rationale you linked does explicitly address this:
The solution here is slightly different because it gives a type to unreachable code. This, for example, makes it still decomposable. So same practical effect, but with slightly nicer theoretical properties.
The solution here is slightly different because it gives a type to unreachable code. This, for example, makes it still decomposable. So same practical effect, but with slightly nicer theoretical properties.
In the short-term solution, the "type" is just a formal artefact to express skipping validation. If the "skipping validation" approach had been agreed on at an informal level, it would have almost certainly been formalised in this way. I agree that the approaches diverge when considering the long-term solution.
This, for example, makes it still decomposable.
I don't believe the resulting language would be decomposable in the same way that the current language is. I also don't believe this is a big deal :P
Agreed. So do we have insights on reasons to not skip validation that have not been addressed above?
It is agnostic to reachability: neither the specification nor producers need to define, check, or otherwise care about reachability (unless they choose to, e.g. to perform dead code elimination).
However, this property seems to already not hold even for the original WebAssembly. In particular, select requires type-directed compilation, and without the relevant type information there is no guidance as to how it should be compiled. So WebAssembly compilers need to reason about unreachability already, likely by simply not compiling unreachable code (which addresses the attack vector mentioned here).
I'd like to hear implementers' perspectives on how dead code is currently handled (e.g. in streaming compilation), and whether they'd have any concerns if it was no longer validated. The quoted point (from @RossTate's OP) seems very reasonable.
@conrad-watt:
I think the concerns about the "syntactic definition of reachability" are overstated.
Not really. We actually had lengthy and heated discussions about the details of that at the time. For example, whether it should extend beyond a block's end. There is no canonical definition. At the same time, an impedance mismatch in the notion of unreachable would get in the way of codegen.
The analogy in Wasm for the JVM's approach would be to completely forbid syntactically dead code.
The relevant aspect of the analogy is that the JVM does not forbid dead code, and that's pretty important for producers. It merely requires it to be type-annotated.
I don't think this type should rely on subtyping.
If you want to keep checking dead code, then a simple bottom stack type would not be enough, since there can be known types on the top of the stack (it becomes more like a row variable). Either way, it's a separate, more complex extension. We could add that, too, but it seems like less bang for the buck, in the sense that it won't change engines, merely complicate the spec. Depending on the instruction set, it might not even replace the bottom value type (e.g., if you had pick without a stack type annotation).
FWIW, even on stack types, it would still be a form of subtyping (or something largely equivalent).
In the short-term solution, the "type" is just a formal artefact to express skipping validation.
That's not correct. The bottom value type does not "skip validation", it merely makes it more permissive. In particular, something like (unreachable) (i32.const 0) (i64.add) is still rejected.
I'm also not sure if anyone is relying on decomposability
In general, it means that dead code is still type-checked to the extent that it could be executed. As I mentioned elsewhere, this e.g. addressed potential security concerns.
likely by simply not compiling unreachable code (which addresses the attack vector mentioned here)
Among other things, that work-around doesn't apply to interpreters.
Not really. We actually had lengthy and heated discussions about the details of that at the time. For example, whether it should extend beyond a block's end. There is no canonical definition. At the same time, an impedance mismatch in the notion of unreachable would get in the way of codegen.
My point is that all implementations have converged around a set of instructions which, at validation time, must produce PolyStack [] as an output type. It is precisely these instructions that would be considered "syntactically dead", and now produce ⊥. I don't think there would be the same heated debate again. Deliberately or not, WebAssembly has effectively already adopted a definition of syntactic deadness through its non-deterministic output types.
If you want to keep checking dead code, then a simple bottom stack type would not be enough, since there can be known types on the top of the stack (it becomes more like a row variable).
I addressed this specifically with the additional typing rule C ⊢ e* : ⊥ -> t*. Generating a ⊥ type as output allows remaining instructions in a straight line to be skipped. This also confines the effects of ⊥ to within the current block, which seems the obvious choice.
Note that with @RossTate's extended proposal, which I'm not arguing for, block types could explicitly declare their output type as ⊥.
That's not correct. The bottom value type does not "skip validation", it merely makes it more permissive. In particular, something like (unreachable) (i32.const 0) (i64.add) is still rejected.
See above (also, in this proposal, ⊥ is the type of the whole stack, not of a single stack element - it's not the same as the bottom type from the earlier reference types draft).
Among other things, that work-around doesn't apply to interpreters.
I remember you saying that WebAssembly shouldn't be designed for interpreters! In any case, the code still has to be parsed; a security-conscious interpreter would have options to harden itself. But this all seems like a hypothetical.
EDIT:
The relevant aspect of the analogy is that the JVM does not forbid dead code, and that's pretty important for producers. It merely requires it to be type-annotated.
I was a little off with my analogy - I should have focussed on the point that, on the consumer side, the JVM has never had to handle this kind of polymorphic typing of dead code. I can almost believe that forbidding dead code could be annoying for some producers, but do they care about it being validated? Maybe Wasm could have been designed with input type annotations for dead code from the start, but we're now constrained by webcompat.
My impression from @rossberg's first post and from @conrad-watt is that neither of y'all have objections to at least the short-term solution, so I'd like to hear directly from others about any objections they have.
My impression from @rossberg's first post and from @conrad-watt is that neither of y'all have objections to at least the short-term solution, so I'd like to hear directly from others about any objections they have.
@rossberg's last comment was explicitly an objection to (his understanding of) the short-term solution. I hope he finds my arguments convincing, and I'm very interested to hear from implementers, but we shouldn't put words in his mouth.
Ah, I see how I misread it. Thanks for correcting me.
Probably ~half the tricky fuzz bugs we've found in the wasmparser crate's validator (which powers wasmtime, cranelift, and aarch64 spidermonkey) and in cranelift's wasm-to-ir translator involve unreachable code, often with its interaction with new proposals (e.g. multi-value).
I've heard similar things from people who've hacked on spidermonkey's other wasm validator; hopefully those folks can speak up here as well.
The short term proposal (where we limit its effects to within a block) sounds like a great idea to me.
Binaryen has always had an explicit unreachable type in its IR that behaves almost like the proposed long-term solution, so the short-term solution would not require any meaningful changes to most of Binaryen and would allow some nice simplifications in some of Binaryen. It would also help reduce the cognitive overhead of working with Binaryen IR by making its unreachable handling more similar to Wasm's unreachable handling.
Binaryen has always had an explicit unreachable type in its IR that behaves almost like the proposed long-term solution
Is that true? I was under the impression that binaryen's unreachable was quite a bit more complex than described here.
I addressed this specifically with the additional typing rule C ⊢ e* : ⊥ -> t*. Generating a ⊥ type as output allows remaining instructions in a straight line to be skipped.
It took me a bit to realize how this works. For others who were confused like me: the idea is that if ⊥ is on top of the type stack during validation, then this rule can consume any number of expressions (e*), and yield any number of result types (t*). This would allow it to match the expected results for that block.
EDIT: actually, ⊥ must be the entire type stack, see Conrad's comment below.
often with its interaction with new proposals (e.g. multi-value).
Agreed, I've found more cases where unreachability is complex in the newer proposals too (e.g. GC) when the annotations are removed. I believe this is essentially the same problem as select.
the idea is that if ⊥ is on top of the type stack during validation, then this rule can consume any number of expressions (e*), and yield any number of result types (t*)
Just to be 100% precise, with this proposal, a type stack can be either t* or ⊥, so ⊥ isn't just the top of the type stack, it's the whole thing. We wouldn't have to handle anything like t : ⊥ or ⊥ : t.
EDIT:
Just to note again, without @RossTate's extended proposal, ⊥ wouldn't actually show up in any program type annotations. It's purely a formal contrivance to represent validation being skipped.
Is that true? I was under the impression that binaryen's unreachable was quite a bit more complex than described here.
Agree there are probably mismatches (no worse than currently). For example I think binaryen can label the output of an infinite loop as unreachable independent of its explicit type annotation.
Binaryen's unreachable is similar to the long term solution in that blocks, loops, and other control flow structures can be unreachable. The complexity comes from Binaryen's rules about when they cannot be unreachable, e.g. when they are branch targets. ~So actually an infinite loop cannot be unreachable in Binaryen because it is necessarily a branch target.~ (Edit: this is true for blocks but not loops, so infinite loops do end up being unreachable. Complicated!) But those details aren't so important here.
Instructions in Binaryen that are not control flow structures and not unreachable polymorphic can have unreachable type only when they have unreachable-typed children. So Binaryen's i32.add instruction can have types (i32, i32) -> i32, (unreachable, i32) -> unreachable, (i32, unreachable) -> unreachable, or (unreachable, unreachable) -> unreachable. Under those rules, instructions that take no arguments cannot be unreachable, so for example f32.const only ever has type () -> f32. With this proposal, we would have to allow instructions with no arguments to be unreachable as well, for example letting f32.const have type () -> unreachable. But that's a trivial change that should be mostly mechanical.
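For illustration, the Binaryen-style propagation rule described above might be sketched as follows (the function name and string-valued types are hypothetical, not Binaryen's actual API):

```python
# Sketch of the rule: a non-control-flow, non-polymorphic instruction
# has unreachable type exactly when one of its children does.

UNREACHABLE = "unreachable"

def result_type_i32_add(lhs_type, rhs_type):
    """Compute the result type of i32.add from its children's types."""
    if UNREACHABLE in (lhs_type, rhs_type):
        return UNREACHABLE
    assert lhs_type == rhs_type == "i32"
    return "i32"

print(result_type_i32_add("i32", "i32"))          # i32
print(result_type_i32_add("unreachable", "i32"))  # unreachable
```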
So Binaryen's i32.add instruction can have types (i32, i32) -> i32, (unreachable, i32) -> unreachable, (i32, unreachable) -> unreachable, or (unreachable, unreachable) -> unreachable
That sounds a little different, because you're using unreachable as the type of a single stack element, rather than as the type of the whole stack (see my last comment above). In any case, it's not crucial that Binaryen's internals match the Wasm spec types precisely (considering they already don't!).
Given that we have no "unreachable" type, and blocks have to have types, the only places I can think of where someone is type-checking with an unknown is between a branch and an "end", correct? Or did I miss something?
I believe that's correct (albeit I'm implicitly refining "branch" to unreachable/br/throw and the like).
@conrad-watt:
My point is that all implementations have converged around a set of instructions which, at validation time, must produce PolyStack [] as an output type. It is precisely these instructions that would be considered "syntactically dead", and now produce ⊥.
You may assume that with hindsight, but all I can say is that there was no agreement on this at the time, and different definitions were proposed by different folks. The current, more canonical semantics actually served as a kind of forcing function for the solution you see now.
If you want to keep checking dead code, then a simple bottom stack type would not be enough, since there can be known types on the top of the stack (it becomes more like a row variable).
I addressed this specifically with the additional typing rule C ⊢ e* : ⊥ -> t*. Generating a ⊥ type as output allows remaining instructions in a straight line to be skipped.
Note how I specifically said "If you want to keep checking dead code", which this would not be addressing.
I remember you saying that WebAssembly shouldn't be designed for interpreters!
Well, the context probably was that its design doesn't need to be optimised for interpreters, which was true for a long time (although I think @titzer changed his mind about that recently). But security obviously is a different consideration.
@rossberg's last comment was explicitly an objection to (his understanding of) the short-term solution.
Right, I was initially misreading that as the bottom value type solution, which is just a minor tweak to what all engines are implementing already. The bottom result type is a more fundamental and likely more controversial change.
But just to be clear, I'm just conveying the discussions we had. The all-code-must-be-valid argument isn't mine (even though I'm somewhat sympathetic). IIRC, it was several engine folks who pushed it most strongly at the time, but I don't remember the details.
@RossTate's OP:
This [the short-term] solution does not seem to have been discussed in the design rationale.
It is a version of the don't-check-dead-code solution (for one specific definition of dead code), which is discussed in the doc, and was originally discussed and rejected by implementers, see above. Whether that has changed I don't know.
It also seems to be compatible with all the desirable properties provided there except for the following:
It is agnostic to reachability: neither the specification nor producers need to define, check, or otherwise care about reachability (unless they choose to, e.g. to perform dead code elimination).
However, this property seems to already not hold even for the original WebAssembly. In particular, select requires type-directed compilation, and without the relevant type information there is no guidance as to how it should be compiled. So WebAssembly compilers need to reason about unreachability already, likely by simply not compiling unreachable code (which addresses the attack vector mentioned here).
You are talking about the implementation details of consumers, while the statement you cite explicitly talks about semantics and producers targeting that semantics. (A consumer OTOH has many reasons to care about reachability, but will typically use a completely different definition, e.g., when it's decoding directly into SSA form. As for type-directed compilation, that's not an issue, it can just propagate bottom to an arbitrary type without otherwise caring.)
@fitzgen:
Probably ~half the tricky fuzz bugs we've found in the wasmparser crate's validator (which powers wasmtime, cranelift, and aarch64 spidermonkey) and in cranelift's wasm-to-ir translator involve unreachable code
That's not surprising, since this is the path that will be least well-explored by regular tests. The question is what this implies. Arguing against extra safety checks solely on the basis that they are harder to test is a rather problematic proposition.
Having implemented dead code validation in multiple engines and having participated in the history of this unfortunate topic from the beginning, I'll just heave a sigh here first. There were loud voices that called for banning unreachable code and other loud voices going the other way. We went back and forth in multiple iterations, and it seems like it keeps popping up at every new addition to the type system. It'll probably never be clean to do and despite our devotion to the formalism, implementors will continue to just wing it and sort it out via test cases in the spec suite. So suffice it to say, having a really simple mental model for it and a great test suite will be very valuable in a concrete way.
The union of all proposals in flight is actually nontrivial to do in a clean, fast way.
Let me just offer this perspective for how I've done this in the past. I'm not proposing any specific change to the formalism.
Pretty much all fast validators track unreachability as a bit in the control stack--i.e. syntactically reachable or not, for the current control block. A natural way to write this is to hide unreachability behind the "pop" operation. There are actually three pop operations:
- popping a value with an expected type from the value stack
- popping a polymorphic type from the value stack with a specific constraint, e.g. that it is a reference type or function reference type
- popping a value of any type from the stack (drop)
After an instruction that ends control flow (e.g. throw, unreachable, return_call), the value stack is emptied back to the start of the current control block and marked unreachable. Any pop operation beyond the top of the current control block would normally be an error, except if the control is marked unreachable, and then it might succeed, depending on the situation above.
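A rough sketch of this pop-based scheme (class, field, and method names are assumptions for illustration; real validators track more per-frame state and more pop variants):

```python
# Each control frame records the value-stack height at block entry and
# an `unreachable` bit. A pop that would underflow the current frame
# fails, *unless* the frame is marked unreachable, in which case the
# pop succeeds polymorphically with the expected type.

class Frame:
    def __init__(self, height):
        self.height = height        # value-stack height at block entry
        self.unreachable = False

class Validator:
    def __init__(self):
        self.vals = []              # value (type) stack
        self.ctrl = [Frame(0)]      # control stack; one frame per block

    def pop_expect(self, expected):
        frame = self.ctrl[-1]
        if len(self.vals) == frame.height:
            # Underflowing the current block is only OK in dead code.
            if frame.unreachable:
                return expected     # polymorphic pop succeeds
            raise TypeError("stack underflow")
        got = self.vals.pop()
        if got != expected:
            raise TypeError(f"expected {expected}, got {got}")
        return got

    def mark_unreachable(self):
        # After e.g. `unreachable`, `return`, `br`: empty the stack back
        # to the block entry and mark the frame unreachable.
        frame = self.ctrl[-1]
        del self.vals[frame.height:]
        frame.unreachable = True

v = Validator()
v.mark_unreachable()
print(v.pop_expect("i32"))  # pops polymorphically in dead code
```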
There are some wrinkles, and it actually turns out that you cannot always hide unreachability behind the pop operation, particularly because of some missing type annotations in the function references proposal.
- in the function references proposal, call.ref should pop a function reference off the stack. If the stack is empty and unreachable, then this succeeds, but the stack continues to be unreachable and no new value types are pushed. Because there is no type annotation for call.ref, there is no known type to be pushed, so nothing is.
- In contrast, func.bind does have an output type annotation and always pushes a funcref of that type, regardless of whether the control stack was reachable or not.
There are other special cases.
Overall the lack of type annotations in various places makes the current set of in-flight proposals kind of baroque.
As for interpreters, my position has, as politicians like to say, "evolved". I'd summarize it as: Wasm MVP was not designed for interpreters but it was designed for fast validation and was simple enough that modulo control flow, it can be dealt with in an interpreter fairly OK. To preserve some of the good properties that it does have, we mostly just have to avoid mistakes. We don't need to hyper optimize for interpreters, just avoid egregiously bad choices.
@titzer thanks for this! The unreachable-handling approach you're describing is what gave me confidence that there would be fewer debates over the definition of syntactic deadness this time around.
It'll probably never be clean to do and despite our devotion to the formalism, implementors will continue to just wing it and sort it out via test cases in the spec suite. So suffice it to say, having a really simple mental model for it and a great test suite will be very valuable in a concrete way.
The union of all proposals in flight is actually nontrivial to do in a clean, fast way.
At the risk of this comment being referenced in a future post-mortem, I do strongly believe that the block-local unreachable change would significantly simplify the mental model. There may have been a paralyzing number of alternatives when the language was initially designed, but due to current webcompat constraints our conversation can (maybe accidentally) be much more focussed.
with reference to @rossberg...
You may assume that with hindsight, but all I can say is that there was no agreement on this at the time, and different definitions were proposed by different folks. The current, more canonical semantics actually served as a kind of forcing function for the solution you see now.
I agree that the current type checking conventions are mainly an emergent property of the formal design. It would be good to get perspectives from current engine implementers about appetite for this change. I'm not sure who to tag - maybe @lukewagner and @gahaas?
@conrad-watt If I read your proposal right, that would just mean that push() is a no-op if the top of the control stack is marked as unreachable, correct? If so, then it would be a slight simplification of some of the cases.
IIRC one argument against this was that obvious type errors in unreachable code would be suppressed, e.g. (unreachable) (f32.const 0) (f32.const 0) (i32.add) would pass validation.
@conrad-watt If I read your proposal right, that would just mean that push() is a no-op if the top of the control stack is marked as unreachable, correct? If so, then it would be a slight simplification of some of the cases.
Not just push, but also pop. So there would no longer need to be any special polymorphic handling for dead code.
A revised algorithm could always work with a concrete type stack. When "an instruction that ends control flow" (ref. your post) is encountered, validation empties/discards the current control block's type stack, skips to the end of the block, pushes the block's output type annotation, then continues.
EDIT: it is still necessary to decode and check certain syntactic constraints on the skipped instructions, see (https://github.com/WebAssembly/design/issues/1379#issuecomment-694430267)
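A sketch of this revised algorithm on a toy flat encoding (the instruction tuples, the `local.get` bounds check, and i32-only locals are illustrative assumptions):

```python
# Revised algorithm: hitting an instruction that ends control flow
# discards the block's type stack and skips type-checking the rest,
# but every skipped instruction is still decoded so that structural
# constraints on immediates (here, a bounds check) are enforced.

ENDERS = {"unreachable", "return"}

def validate(instrs, results, num_locals=2):
    """Each instruction is a (opcode, immediate) pair."""
    stack, dead = [], False
    for op, imm in instrs:
        if op == "local.get" and not (0 <= imm < num_locals):
            return False            # immediates are checked even when dead
        if dead:
            continue                # type-checking is skipped when dead
        if op in ENDERS:
            dead = True
        elif op == "local.get":
            stack.append("i32")     # toy assumption: all locals are i32
    if dead:
        stack = list(results)       # block yields its declared results
    return stack == results

print(validate([("unreachable", None), ("local.get", 0)], ["f64"]))  # True
print(validate([("unreachable", None), ("local.get", 7)], ["f64"]))  # False
```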
IIRC one argument against this was that obvious type errors in unreachable code would be suppressed, e.g. (unreachable) (f32.const 0) (f32.const 0) (i32.add) would pass validation.
Yes, this would happen (although such code would still have to be parsed).
@titzer And to clarify, it's not the "top of the control stack" that is typed as unreachable; it is the entire control stack that is typed as unreachable.
@RossTate: I think @titzer's control stack is referring to nested control blocks - with one type/value stack associated with each entry on the "control stack". So the top of the control stack is marked unreachable (the innermost block being typed) - which is equivalent to declaring the bottom of the current type stack to be polymorphic/unconstrained.
In the revised algorithm, this unreachable bit on the control stack wouldn't be needed at all (except potentially to implement the "skip-to-end" handling).
When "an instruction that ends control flow" (ref. your post) is encountered, validation empties/discards the current control block's type stack, skips to the end of the block, pushes the block's output type annotation, then continues.
We discussed such an alternative too. That has the downside that facts about immediates (e.g. that indices are in bounds) are not checked. That means you need two modes in the validation algorithm; one that skips immediates (i.e. basically a bytecode iterator) and one that does not. This is because you can't find the end bytecode without properly decoding everything in-between.
Hiding everything in the various pop operations has the distinct advantage that the core logic for each bytecode generally doesn't have to do anything special, except for the special cases I listed above, which are all due to missing type annotations. Having unreachability spreading into the decoding logic for immediates or another mode for iterating bytecodes seems like adding complexity.
Hiding everything in the various pop operations has the distinct advantage that the core logic for each bytecode generally doesn't have to do anything special, except for the special cases I listed above, which are all due to missing type annotations.
The type annotations are only "missing" to the extent that they facilitate the current typing approach. They wouldn't be necessary in the JVM because it never has to deal with polymorphic stacks. Similarly, they wouldn't be necessary in this proposal. So another way of looking at it is that the current approach causes overhead in type annotation.
Having unreachability spreading into the decoding logic for immediates or another mode for iterating bytecodes seems like adding complexity.
There's a complexity trade-off - but I think the balance is heavily on the side of this proposal. You can't encapsulate the current typing logic purely in pop operations because of select, which needs to be special-cased (and the obvious solutions cause the representation of the type stack to become more complicated).
It's possible to factor the "skipping bytecode" logic so that it's clear what the "normal case" for each bytecode is.
The type annotations are only "missing" to the extent that they facilitate the current typing approach.
Well, type annotations are like violence; if they aren't working, it's because you're not using enough of them :)
But seriously, all of the hard cases that make the pop approach tricky have to do with instructions that are polymorphic or have had type annotations removed. And the current answer always seems to be that polymorphic instructions, even only mildly polymorphic instructions, like ref.as_non_null, are essentially not validated when unreachable. That bothers me, but mode-switching the validator implementation bothers me even more--unreachable checks just spread everywhere, or you end up with a skipToEnd() function, and both seem like regressions to me.