ExplicitConversion_FromSingle failing due to NaN != NaN
Build Information
Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=705150
Build error leg or test failing: System.Tests.HalfTests.ExplicitConversion_FromSingle
Pull request: https://github.com/dotnet/runtime/pull/103306
Error Message
Fill the error message using step by step known issues guidance.
{
"ErrorMessage": "ExplicitConversion_FromSingle",
"ErrorPattern": "",
"BuildRetry": false,
"ExcludeConsoleLog": false
}
Known issue validation
Build: :mag_right: https://dev.azure.com/dnceng-public/public/_build/results?buildId=705150
Error message validated: [ExplicitConversion_FromSingle]
Result validation: :white_check_mark: Known issue matched with the provided build.
Validation performed at: 6/12/2024 3:13:29 PM UTC
[14:40:21] info: [FAIL] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[14:40:21] info: Assert.Equal() Failure: Values differ
[14:40:21] info: Expected: NaN
[14:40:21] info: Actual: NaN
[14:40:21] info: at System.AssertExtensions.Equal(Half expected, Half actual)
[14:40:21] info: at System.Tests.HalfTests.ExplicitConversion_FromSingle(Single f, Half expected)
[14:40:21] info: at System.Object.InvokeStub_HalfTests.ExplicitConversion_FromSingle(Object , Span`1 )
[14:40:21] info: at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
[14:40:21] info: [FAIL] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[14:40:21] info: Assert.Equal() Failure: Values differ
[14:40:21] info: Expected: NaN
[14:40:21] info: Actual: NaN
[14:40:21] info: at System.AssertExtensions.Equal(Half expected, Half actual)
[14:40:21] info: at System.Tests.HalfTests.ExplicitConversion_FromSingle(Single f, Half expected)
[14:40:21] info: at System.Object.InvokeStub_HalfTests.ExplicitConversion_FromSingle(Object , Span`1 )
[14:40:21] info: at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
Report
Summary
| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
|---|---|---|
| 2 | 23 | 40 |
Tagging subscribers to this area: @dotnet/area-infrastructure-libraries See info in area-owners.md if you want to be subscribed.
Tagging subscribers to this area: @dotnet/area-system-numerics See info in area-owners.md if you want to be subscribed.
This issue is matching test jobs that merely log test names. We need to constrain the match further.
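One way the match could be narrowed, sketched in Python: require the `[FAIL]` marker and the NaN arguments in front of the test name, so that lines which only log the name do not match. The pattern below is a hypothetical candidate for the `ErrorPattern` field, not the value applied to this issue, and it is only checked against log lines quoted in this thread.

```python
import re

# Hypothetical candidate for "ErrorPattern": require the [FAIL] marker and the
# NaN arguments rather than the bare test name.
CANDIDATE_PATTERN = (
    r"\[FAIL\] System\.Tests\.HalfTests\.ExplicitConversion_FromSingle"
    r"\(f: NaN, expected: NaN\)"
)


def matches(line: str) -> bool:
    """Return True if the candidate pattern is found anywhere in the log line."""
    return re.search(CANDIDATE_PATTERN, line) is not None


# Two log lines taken from the console output quoted in this thread.
fail_line = (
    "[18:27:21] info: [FAIL] System.Tests.HalfTests."
    "ExplicitConversion_FromSingle(f: NaN, expected: NaN)"
)
start_line = (
    "[18:27:21] info: [STRT] System.Tests.HalfTests."
    "ExplicitConversion_FromSingle(f: NaN, expected: NaN)"
)

print(matches(fail_line))   # True
print(matches(start_line))  # False
```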
Just hit this in #106040 for `WasmTestOnChrome-ST-System.Runtime.Tests`, config: `net9.0-browser-Release-wasm-Mono_Release-WasmTestOnChrome`
[18:27:21] info: [STRT] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[18:27:21] info: [FAIL] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[18:27:21] info: Assert.Equal() Failure: Values differ
[18:27:21] info: Expected: NaN
[18:27:21] info: Actual: NaN
[18:27:21] info: at System.AssertExtensions.Equal(Half expected, Half actual)
[18:27:21] info: at System.Tests.HalfTests.ExplicitConversion_FromSingle(Single f, Half expected)
[18:27:21] info: at System.Object.InvokeStub_HalfTests.ExplicitConversion_FromSingle(Object , Span`1 )
[18:27:21] info: at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
[18:27:21] info: [STRT] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[18:27:21] info: [FAIL] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[18:27:21] info: Assert.Equal() Failure: Values differ
[18:27:21] info: Expected: NaN
[18:27:21] info: Actual: NaN
[18:27:21] info: at System.AssertExtensions.Equal(Half expected, Half actual)
[18:27:21] info: at System.Tests.HalfTests.ExplicitConversion_FromSingle(Single f, Half expected)
[18:27:21] info: at System.Object.InvokeStub_HalfTests.ExplicitConversion_FromSingle(Object , Span`1 )
[18:27:21] info: at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
Not sure why it didn't match in build analysis
The test here should likely be filtered out on WASM for the time being, with a tracking issue against it.
WASM isn't technically doing anything wrong here; normalizing NaN is fully allowed by the IEEE 754 spec. However, it is undesirable, not recommended in most cases, and should typically be avoided where possible.
WASM also has instructions that should guarantee the underlying bits are preserved, so it should be possible for it to be preserved and match the behavior of other targets (both for RyuJIT and Mono).
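To make the difference between preserving and normalizing concrete, here is a minimal Python sketch that narrows a float32 NaN to float16 purely on integer bit patterns. It assumes only the standard IEEE 754 binary32 and binary16 layouts. The function names are made up for illustration, the "canonical" form chosen here (positive sign, zero payload) is just one possible choice, and none of this is the RyuJIT, Mono, or browser implementation.

```python
# Bit-level model of narrowing a float32 NaN to float16.
# Illustrative only; function names are hypothetical.

F32_EXP_MASK = 0x7F800000   # 8 exponent bits, all ones for NaN/Infinity
F32_MAN_MASK = 0x007FFFFF   # 23 mantissa (payload) bits
F16_EXP_MASK = 0x7C00       # 5 exponent bits, all ones for NaN/Infinity
F16_QUIET_BIT = 0x0200      # top mantissa bit marks a quiet NaN


def is_f32_nan(bits: int) -> bool:
    """NaN: exponent all ones and a non-zero mantissa."""
    return (bits & F32_EXP_MASK) == F32_EXP_MASK and (bits & F32_MAN_MASK) != 0


def narrow_nan_preserving(bits: int) -> int:
    """Keep the sign and the top 10 payload bits; force the quiet bit."""
    if not is_f32_nan(bits):
        raise ValueError("not a float32 NaN bit pattern")
    sign = (bits >> 16) & 0x8000             # move bit 31 down to bit 15
    payload = (bits & F32_MAN_MASK) >> 13    # top 10 of the 23 mantissa bits
    return sign | F16_EXP_MASK | F16_QUIET_BIT | payload


def narrow_nan_canonical(bits: int) -> int:
    """Discard sign and payload; always return one fixed NaN pattern."""
    if not is_f32_nan(bits):
        raise ValueError("not a float32 NaN bit pattern")
    return F16_EXP_MASK | F16_QUIET_BIT


negative_quiet_nan = 0xFFC00000
print(hex(narrow_nan_preserving(negative_quiet_nan)))  # 0xfe00
print(hex(narrow_nan_canonical(negative_quiet_nan)))   # 0x7e00
```

Both results are NaN and would print as `NaN`, which is the same shape as the `Expected: NaN` / `Actual: NaN` output in the log. Whether the sign or the payload is what actually differs in the failing runs is not established in this thread.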
Hm the weird thing here is that this seems to be an intermittent failure. I've checked and we definitely have passing runs of the test on the exact same config.
Is there perhaps a difference in the WASM version or configuration options for the failing platform as compared to others?
Maybe some machine specific issue that's only triggering in some scenarios?
The tests are running in containers on the same Helix queue, so even the VM/machine should be the same...
Passing run: https://helixre107v0xdcypoyl9e7f.blob.core.windows.net/dotnet-runtime-refs-pull-106013-merge-6f4bade3cf974edcb8/WasmTestOnChrome-ST-System.Runtime.Tests/1/console.3fb2056a.log?helixlogtype=result
Failing run: https://helixre107v0xdcypoyl9e7f.blob.core.windows.net/dotnet-runtime-refs-pull-106040-merge-0b77eabb49f848cab5/WasmTestOnChrome-ST-System.Runtime.Tests/1/console.5edeb661.log?helixlogtype=result
The only difference is the order of the tests, but I'm having a hard time imagining how that could be relevant
~~hm one interesting thing is that the failing run reports two tests that failed i.e. five total, but the passing run only ran four NaN tests...~~ That seems to be just a log quirk: the testResult.xml contains only four NaN tests, two of them failing.
According to Kusto test data this started happening on or before 2024-06-06. I also found that it's happening on Windows-based containers too so it's not related to the underlying OS.
Given it's only happening intermittently and doesn't appear to be a blocker for 9.0 I'm moving it to 10.0
A couple of things could be responsible, either of which might be explained by the test order:
- jiterp might be kicking in
- browser JIT might be kicking in
But I'd like to push back against the idea that bitwise comparison of two NaNs is a good idea. There are many NaN bit patterns; all are semantically indistinguishable. Reasonable code should not depend on the bit pattern staying identical after computation.
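For reference, the two notions of equality being debated can be sketched on float16 bit patterns. This is a simplified Python model with hypothetical helper names, not the actual `AssertExtensions.Equal` code; the NaN-aware variant also ignores the `+0`/`-0` case for brevity.

```python
F16_EXP_MASK = 0x7C00   # 5 exponent bits, all ones for NaN/Infinity
F16_MAN_MASK = 0x03FF   # 10 mantissa (payload) bits


def is_f16_nan(bits: int) -> bool:
    """NaN: exponent all ones and a non-zero mantissa."""
    return (bits & F16_EXP_MASK) == F16_EXP_MASK and (bits & F16_MAN_MASK) != 0


def equal_bitwise(a: int, b: int) -> bool:
    """Strict comparison: the 16-bit patterns must be identical."""
    return a == b


def equal_nan_aware(a: int, b: int) -> bool:
    """Relaxed comparison: any NaN equals any other NaN.

    Simplification: non-NaN values are still compared bit for bit.
    """
    if is_f16_nan(a) and is_f16_nan(b):
        return True
    return a == b


# Count how many of the 65536 float16 bit patterns encode a NaN.
nan_pattern_count = sum(1 for bits in range(0x10000) if is_f16_nan(bits))
print(nan_pattern_count)                  # 2046

print(equal_bitwise(0x7E00, 0xFE00))      # False
print(equal_nan_aware(0x7E00, 0xFE00))    # True
```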
> But I'd like to push back against the idea that bitwise comparison of two NaNs is a good idea. There are many NaN bit patterns; all are semantically indistinguishable. Reasonable code should not depend on the bit pattern staying identical after computation.
This is a core thing frequently depended upon for SIMD, NaN boxing, etc. It also follows the IEEE 754 "recommended" guidelines, even if it's not strictly required by the IEEE 754 spec.
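As a rough illustration of why NaN boxing depends on payloads surviving, here is a small Python sketch on float64 bit patterns. The 48-bit payload layout and all function names are assumptions made up for this example; it does not describe any particular runtime's boxing scheme.

```python
F64_EXP_MASK = 0x7FF0000000000000    # 11 exponent bits, all ones for NaN
F64_QUIET_BIT = 0x0008000000000000   # top mantissa bit marks a quiet NaN
F64_MAN_MASK = 0x000FFFFFFFFFFFFF    # 52 mantissa bits
PAYLOAD_MASK = 0x0000FFFFFFFFFFFF    # hypothetical layout: low 48 bits carry data


def box(value: int) -> int:
    """Hide a 48-bit value inside the payload of a quiet NaN."""
    if not 0 <= value <= PAYLOAD_MASK:
        raise ValueError("value does not fit in 48 bits")
    return F64_EXP_MASK | F64_QUIET_BIT | value


def unbox(bits: int) -> int:
    """Recover the 48-bit value from a boxed NaN."""
    return bits & PAYLOAD_MASK


def is_f64_nan(bits: int) -> bool:
    return (bits & F64_EXP_MASK) == F64_EXP_MASK and (bits & F64_MAN_MASK) != 0


def canonicalize(bits: int) -> int:
    """Model of an operation that replaces every NaN with one fixed pattern."""
    if is_f64_nan(bits):
        return F64_EXP_MASK | F64_QUIET_BIT
    return bits


boxed = box(0xDEADBEEF)
print(hex(boxed))                       # 0x7ff80000deadbeef
print(hex(unbox(boxed)))                # 0xdeadbeef
print(hex(unbox(canonicalize(boxed))))  # 0x0
```

The boxed value round-trips only as long as nothing on the path rewrites the NaN, which is the dependency described above.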
The WASM 1.0 and 2.0 specs similarly:
- distinguish `canonical NaN` from `arithmetic NaN` (a nan with payload `n`)
- provide explicit APIs for getting the bits of any `NaN`
  - `fbits(+/-nan(n))` is defined as `fsign(+/-) * 1^expon(n) * ibits(signif(n))`
  - which is to say, it's doing `BitConverter.SingleToInt32Bits` or `BitConverter.DoubleToInt64Bits` or vice-versa
- recommend that operators propagate `NaN` payloads (matching the IEEE 754 recommendation)
- provide explicit support for specifying the payload of a `NaN`
However, the WASM specs do notably allow non-determinism for `fneg`, `fabs`, and `fcopysign`, explicitly specifying that the sign and payload of the result are non-deterministic.
So while a given implementation is technically allowed to do "other things" (such as always canonicalizing NaN), it's highly irregular to do so, and that's why we have tests that validate such conversions are preserving bits and propagating payloads as expected. It's also notably cheaper to propagate and preserve NaN as is than it is to canonicalize (particularly for things like `fbits`), so there's really no reason for a browser to deviate here.
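The bit encoding of `nan(n)` mentioned above (a sign bit, an all-ones exponent, then the payload `n` in the significand field) can be modelled in a few lines of Python. The function names and the `width` parameter are made up for this sketch; the field widths are the standard 8/23 and 11/52 splits for 32-bit and 64-bit floats.

```python
# (exponent bits, significand bits) for each float width
FIELD_WIDTHS = {32: (8, 23), 64: (11, 52)}


def nan_bits(width: int, negative: bool, payload: int) -> int:
    """Encode +/-nan(payload) as an integer bit pattern."""
    expon, signif = FIELD_WIDTHS[width]
    if not 1 <= payload < (1 << signif):
        raise ValueError("payload must be non-zero and fit in the significand")
    sign = 1 if negative else 0
    exponent_all_ones = (1 << expon) - 1
    return (sign << (width - 1)) | (exponent_all_ones << signif) | payload


def canonical_payload(width: int) -> int:
    """Payload of a canonical NaN: only the top significand bit is set."""
    _, signif = FIELD_WIDTHS[width]
    return 1 << (signif - 1)


print(hex(nan_bits(32, False, canonical_payload(32))))  # 0x7fc00000
print(hex(nan_bits(32, True, canonical_payload(32))))   # 0xffc00000
print(hex(nan_bits(64, False, canonical_payload(64))))  # 0x7ff8000000000000
print(hex(nan_bits(32, False, 0x400001)))               # 0x7fc00001
```

The last line is an arithmetic NaN with an extra payload bit; it is a different bit pattern from the canonical one even though both are NaN.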
@tannergooding @lambdageek what are the next steps here?
Should the milestone be 10? This is hitting 9.0 PRs too. Example:
- 9.0 PR: https://github.com/dotnet/runtime/pull/108326
- Job: https://dev.azure.com/dnceng-public/public/_build/results?buildId=839877&view=logs&j=d4e38924-13a0-58bd-9074-6a4810543e7c&t=7a581d54-fbdb-5b9d-4377-cc346f489112&l=104
- Log: https://helixre107v0xd1eu3ibi6ka.blob.core.windows.net/dotnet-runtime-refs-pull-108326-merge-06a2d595e69e44cd8b/WasmTestOnV8-ST-System.Runtime.Tests/1/console.46ef387c.log?helixlogtype=result