wasm-tools icon indicating copy to clipboard operation
wasm-tools copied to clipboard

Strategy for targetting specific WebAssembly specification snapshots

Open nagisa opened this issue 2 years ago • 5 comments

Today we have a couple of WebAssembly specifications – the wasm-core-1 W3C Recommendation is a snapshot from 2019 and then there is a living draft for 1.1 which includes multiple new proposals, including, but not limited to, multi-value, simd, saturating-int-to-float and sign-extension-ops.

Some users (e.g. me) of wasmparser, wasm-smith etc. want to constrain themselves to the more conservative wasm-core-1 snapshot, as can be seen by PRs such as https://github.com/bytecodealliance/wasm-tools/pull/482 or my own https://github.com/bytecodealliance/wasm-tools/pull/525. In my case it is because accidentally allowing some functionality through forces me to maintain it in perpetuity, but also because the VM I maintain currently targets wasm-core-1.

I know that the wasm-tools are largely maintained for the purpose of wasmtime, which may or may not care at all about wasm-core-1. Adopting a more principled strategy here would naturally require more effort from people looking to add implementations for the new proposals. My thinking is still that it would be nice to make wasm-core-1 a first-class target for the tooling provided by this repository. I'm willing to dedicate some of my time to set this up in a way that reduces the overhead as much as possible for future contributors.

First, I could see myself investigating an implementation of a test suite or some such that ensures a validator for wasm-core-1 does not accept instructions or constructs that postdate wasm-core-1, or that wasm-smith and other tools does not deal in these same constructs, where appropriate.

I would also like to discuss adding additional APIs that focus wasm-core-1. For example a WasmFeatures::wasm_core_1 seems like it could be a pretty non-controversial addition.

Would such an effort/direction/strategy make sense to you?

nagisa avatar Mar 28 '22 15:03 nagisa

Currently in my opinion the experience of maintaining all the feature-gated code is pretty small and easy to keep. In that sense I'm ok with keeping feature guards for a pretty long time until they become too onerous (but that hasn't happened yet).

Otherwise though I'd think we'd rather not be in the business of naming wasm snapshots. In some sense that's more of a CG-level question of at what points is the specification "published" and if there's even a concept of snapshots like you're mentioning.

If, however, there's a concept of snapshots at the wasm spec level then I think what you're mentioning here seems pretty reasonable (e.g. WasmFeatures::the_snapshot_name()) and testing might not be too hard as we would ideally just have another submodule checkout of the https://github.com/WebAssembly/testsuite module but at a different commit (or something like that)

alexcrichton avatar Mar 28 '22 17:03 alexcrichton

Happy to add feature flags for proposals wherever we are missing them (thanks for adding the flag for non-trapping conversions to wasm-smith!) as well as expand the test suite to check that we don't accidentally emit code that relies on proposals that are disabled.

As an aside, I think you're kinda swimming against the stream by trying to support a specific published version of the spec, since it is a "live spec" and is updated as new proposals are ready. Both engines and toolchains tend to have --enable-whatever flags for each proposal, and once they are stable enough, they get enabled by default. Trying to do something different for just your engine (rather than just supporting / not supporting various proposals) is likely to cause unnecessary friction for you. At the same time, the CG has talked about defining named "profiles" that describe a set of features users can rely on being implemented together; maybe that is something you'd be interested in in the future.

fitzgen avatar Mar 28 '22 17:03 fitzgen

Otherwise though I'd think we'd rather not be in the business of naming wasm snapshots.

Sorry if I made the impression that it'd be a prerogative of anybody else than CG. By saying wasm-core-1 in the original text I really meant the Version 1.0 of the specification, as “released” by the CG and tagged in the specification repository, and I probably should've referred to it as such.

As an aside, I think you're kinda swimming against the stream by trying to support a specific published version of the spec, since it is a "live spec" and is updated as new proposals are ready. Both engines and toolchains tend to have --enable-whatever flags for each proposal, and once they are stable enough, they get enabled by default. Trying to do something different for just your engine (rather than just supporting / not supporting various proposals) is likely to cause unnecessary friction for you.

Ultimately every specific published version will correspond to a specific set of features, won't it? To the best of my knowledge the WebAssembly Specification version 1.0 corresponds largely to today's wasmparser::WasmFeatures with most of the features disabled. That said, I think the exact approach as to how wasm-tools and the rest of the ecosystem handles evolution of the specification is somewhat orthogonal to the problem I'm looking to solve. So let me try rephrasing the problem statement.

In very general terms, my concern largely lies within lack of confidence in upgrading, for example, the wasmparser dependency. In particular I'm worried that by upgrading the dependency, the wasmparser::Validator will begin accepting new instructions and/or functionality introduced by some proposal that has been just merged or is set to be merged to the living WebAssembly specification.

Historically wasmparser accepting more code than it should is a recurrent problem: blocks with multi-value types without the multi-value feature is a recent example of this (fixed in https://github.com/bytecodealliance/wasm-tools/pull/465). Validator accepting the non-trapping and sign-extending conversions without any feature flags, despite the fact that these were introduced by a proposal, is another (fixed in https://github.com/bytecodealliance/wasm-tools/pull/482). All three of these proposals are reasonably stable today, but the confidence likely wasn't there when the implementation of these instructions/proposals was landed initially, right?

This is why the version 1.0 of the specification is interesting to me. It is naturally not a great Default-default for the Validator, but the fact that corresponds to an approximately empty feature set also gives a good baseline for verifying just how water-tight the implementation of the feature flags is in the first place.

Once we have a water-tight Validator, throwing wasm-smith output at a Validator with wasm-core-1-level features would also readily expose most of the instances where e.g. wasm-smith is incorrectly generating code targeting a recently implemented proposal, etc.

Hope that helps to clarify my thinking.

nagisa avatar Mar 28 '22 19:03 nagisa

I agree with Nick that even what you're mentioning is sort of swimming against the stream. My impression of the CG is that there are few, if any, who think of the spec as having discrete tags as opposed to continuously evolving over time. While there was indeed a specific state for the first publication in practice it doesn't matter too much as engines continue to grow and implement more features.

In general wasmparser has never been good about negative tests of the form you're mentioning, for example "this module should be rejected if this feature is disabled". Additionally wasm-smith is unable to produce a module which is expected to fail validation with a certain feature set. For example wasm-smith, if simd is enabled, may produce a module that fails to validate without simd enabled in the validator, but that isn't guaranteed.

Overall for wasmparser the main goal is to implement relevant wasm proposals and to have wasmparser always in a state of "if the feature were ungated today this is exactly how we'd want wasmparser to look". At the same time we try to gate feature proposals so if it isn't activated no one has access to it. We clearly make mistakes, such as the multi-value issue you found, however.

All that to say, AFAIK the proposal here is something like WasmFeatures::some_method() which seems fine to me. I don't think there's general CG consensus that some_method has a specific name or even exists, though. While there are verisons of published specifications I'm unaware of any engines that work at spec versions as opposed to the level of individual proposals.

alexcrichton avatar Mar 29 '22 15:03 alexcrichton

A relevant discussion is https://github.com/WebAssembly/spec/issues/870. It occurred a long while ago, so I don't know if the perception of the topic has changed since, but it largely reaffirms my impression that the features aren't all independently toggle-able, at least.

nagisa avatar Apr 04 '22 12:04 nagisa