INTERPRETING.md should clarify that PASS/FAIL results are per test (file), not scenario (strict, non-strict)

Open linusg opened this issue 2 years ago • 2 comments

The concept of a "test scenario" is not outlined in INTERPRETING.md but is common in runners such as https://github.com/bocoup/test262-stream and https://github.com/bterlson/test262-harness, which report separate PASS and FAIL results per test scenario (approx. twice the test file count, ignoring noStrict and onlyStrict). This is problematic because it can skew test results towards a more positive outcome than what I believe is the intended way of counting them: a "test" (one file) only passes if, by default, the outcome matches the expectation in both strict and non-strict mode. A hypothetical engine failing all tests in strict mode and passing all in non-strict mode would therefore receive a score of 0%, not 50% "passing tests".
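
To make the difference between the two counting methods concrete, here is a minimal sketch. The record shape (`file`, `scenario`, `pass`) and the function names are hypothetical and not taken from test262-stream or test262-harness; the data models the hypothetical engine that fails everything in strict mode.

```javascript
// Hypothetical result records, one per scenario (test file + mode).
const scenarioResults = [
  { file: "a.js", scenario: "non-strict", pass: true },
  { file: "a.js", scenario: "strict", pass: false },
  { file: "b.js", scenario: "non-strict", pass: true },
  { file: "b.js", scenario: "strict", pass: false },
];

// Per-scenario counting: every (file, mode) pair is scored separately.
function scorePerScenario(results) {
  if (results.length === 0) return 0;
  const passed = results.filter((r) => r.pass).length;
  return passed / results.length;
}

// Per-file counting: a file passes only if all of its scenarios pass.
function scorePerFile(results) {
  const byFile = new Map();
  for (const r of results) {
    byFile.set(r.file, (byFile.get(r.file) ?? true) && r.pass);
  }
  if (byFile.size === 0) return 0;
  const passed = [...byFile.values()].filter(Boolean).length;
  return passed / byFile.size;
}

console.log(scorePerScenario(scenarioResults)); // 0.5
console.log(scorePerFile(scenarioResults)); // 0
```

A file with a single scenario (for example one flagged onlyStrict) contributes one verdict under either method, so the two scores only diverge for files that run in both modes.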

linusg avatar Jul 15 '23 14:07 linusg

Relevant sections from INTERPRETING.md that allow this conclusion to be inferred, but could be more explicit:

All tests are declared as text files located within this project's test directory.

Unless configured otherwise (via the noStrict, onlyStrict, module, or raw flags), each test must be executed twice: once in ECMAScript's non-strict mode, and again in ECMAScript's strict mode.

By default, tests signal failure by generating an uncaught exception. If execution completes without generating an exception, the test must be interpreted as "passing." Any uncaught exception must be interpreted as test failure.
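
Read together, these excerpts suggest that the set of scenarios follows from a test's flags. A rough sketch of that mapping is below; the function name and the scenario labels are hypothetical, and the handling of `module` and `raw` (a single run each) reflects my reading of INTERPRETING.md rather than anything quoted above, so it should be checked against that document.

```javascript
// Sketch: derive the scenarios a test file must run in from its flags.
// Scenario labels ("non-strict", "strict", "module") are made up for
// illustration; real runners may use different names.
function scenariosFor(flags) {
  if (flags.includes("raw")) return ["non-strict"]; // assumption: run once, unmodified
  if (flags.includes("module")) return ["module"]; // assumption: run once as module code
  if (flags.includes("onlyStrict")) return ["strict"];
  if (flags.includes("noStrict")) return ["non-strict"];
  // Default: the test is executed twice, once in each mode.
  return ["non-strict", "strict"];
}

console.log(scenariosFor([])); // [ 'non-strict', 'strict' ]
console.log(scenariosFor(["onlyStrict"])); // [ 'strict' ]
```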

linusg avatar Jul 15 '23 14:07 linusg

I would certainly not expect partial credit for only implementing one of strict or non-strict, so clarifying the docs seems fine with me.

ljharb avatar Jul 15 '23 15:07 ljharb