spec icon indicating copy to clipboard operation
spec copied to clipboard

[js-api] Conversion from JS values to floats doesn't specify how to handle NaNs

Open Liamolucko opened this issue 2 years ago • 4 comments

The current algorithm within ToWebAssemblyValue to convert from JS values to floats looks like this:

If type is f64,

Let f64 be ? ToNumber(v). Return f64.const f64.

It's very unclear how NaNs should be handled here. ToNumber returns a JS number, which isn't quite the same as a regular f64; it's only defined to have a single NaN value, with an unknown bit pattern. The spec should say something about how that NaN value gets converted, whether it's defined to result in a particular value or just left as implementation-defined.

The actual implementation of this currently differs between browsers. Using this code to test:

;; float-bits.wasm
(module
    (func $f64toi64 (param $float f64) (result i64)
        local.get $float
        i64.reinterpret_f64
    )
    (func $i64tof64 (param $bits i64) (result f64)
        local.get $bits
        f64.reinterpret_i64
    )
    (export "f64toi64" (func $f64toi64))
    (export "i64tof64" (func $i64tof64))
)

const wasm = fetch("./float-bits.wasm");
const { instance } = await WebAssembly.instantiateStreaming(wasm);
const { f64toi64, i64tof64 } = instance.exports;

document.body.textContent = f64toi64(i64tof64(0x7ff8000000001234n)).toString(16);

Chromium preserves the bits and shows 7ff8000000001234, whereas Firefox and Safari pick a single NaN bit pattern of 7ff8000000000000.


In the opposite direction, ToJSValue also doesn't specify how NaNs should be handled:

  1. If w is of the form f64.const f64, return the Number value for f64.

I don't find this as bad, since it's pretty obvious that any NaN should map to the JS NaN, but it wouldn't hurt to add a note about it here too.

Liamolucko avatar Jun 25 '22 06:06 Liamolucko

Thanks for pointing this out, I'll try and take a look soon. I'd assume the ES spec prohibits Chromium's behaviour here, though there's some reference to "implementation distinguishable NaN value[s]"... I don't recall who's worked on this on the JS side.

Ms2ger avatar Jun 27 '22 09:06 Ms2ger

Thanks for this observation. Indeed, the spec does not define the bit pattern of a NaN when the NaN gets passed to WebAssembly from JavaScript.

I see two options, along the lines of https://webassembly.github.io/spec/core/bikeshed/#nan-propagation%E2%91%A0:

  1. ToWebAssemblyValue has to produce a canonical NaN; or
  2. ToWebAssemblyValue has to produce an arithmetic NaN.

The argument for 1) would be that it is more deterministic. However, even within WebAssembly the NaN behavior is not deterministic when patterns like in the example above are used. Also, there are two canonical bit patterns, and not just one. If you have any concrete example why a canonical NaN would be useful, please tell us.

The arguments for 2) would be that a) it's faster if you don't have to canonicalize NaNs. b) In the WebAssembly core spec, canonical NaNs are only required when the input is a canonical NaN. However, as the bit pattern of NaN is not specified in JavaScript, we cannot expect it to be the canonical NaN.

I think 2) is better, because it's faster, seems consistent with the core spec, and I don't see a reason why the NaN handling in the JS-API has to be more strict than the handling within WebAssembly itself.

gahaas avatar Jun 27 '22 12:06 gahaas

@syg perhaps this might be of interest to you too

Ms2ger avatar Jun 27 '22 14:06 Ms2ger

ecma262 does not prohibit Chromium's behavior. There is no normative requirement of NaN canonicalization.

NaN bit patterns are not observable outside of TypedArray writes and reads, so NaNs are indistinguishable when flowing through JS variables. If it flows through a TypedArray write, there's this unhelpful language about "An implementation must always choose the same encoding for each implementation distinguishable NaN value". The normative requirement of that thus hinges entirely on what "implementation distinguishable" means, which, AFAICT, in the limit, means an implementation can choose to distinguish all NaN bit patterns. I'm guessing the intent was something like, don't be too chaotic and start randomizing bit patterns. But still, the TypedArray boundary, and it sounds which is to be expanded to include the WebAssembly boundary now, does permit non-canonicalized NaN bit patterns.

When we last relitigated this, my recollection is basically that we do not force canonicalization, and while we'd like to say something stronger like "bit patterns are preserved through variables", we couldn't figure out how to formulate it without prohibiting implementation techniques and optimizations. For example, suppose we did say exactly that. But then consider let v = F64[0]; F64[1] = v; where F64 is a float64 TypedArray and F64[0] is some NaN bit pattern that's not the canonical NaN. In a NaN-boxing implementation, you now can't canonicalize that NaN and have to special case such data flow, which is pretty undesirable.

I'd argue for 2) out of the two choices @gahaas presented above. JS already has no canonical bit pattern, and the lack thereof is already observable via TypedArray boundaries. It conceptually makes makes sense to me to expand that boundary to include WebAssembly, which deals with linear memories that are exposed as TypedArrays.

syg avatar Jun 27 '22 19:06 syg