test_runner: support expecting a test-case to fail
This PR adds support for an xfail directive (similar to skip and todo), whereby a user can flag a test-case as expected to fail.
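For orientation, a minimal usage sketch in the style of the existing `skip`/`todo` options (the `xfail` option name comes from this PR; the exact reporting behaviour is discussed further down the thread):

```js
const test = require('node:test');
const assert = require('node:assert');

// Marked as expected to fail: a failure here is reported as an expected
// failure rather than failing the run, and an unexpected pass is surfaced.
test('known bug, tracked upstream', { xfail: 'bug #382' }, () => {
  assert.strictEqual(1 + 1, 3); // currently wrong in released code
});
```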
Review requested:
- [ ] @nodejs/test_runner
There is long-established precedent for xfail:
- pytest. See also "XFail: mark test functions as expected to fail" and "How and why I use pytest's xfail" (`pytest.xfail` is distinct from `pytest.fail`)
- Mozilla
- Jest, which calls it `failing`, though the original proposal used the term `xfail`
- DejaGnu Testing Framework
- The Greg testing framework
- See also:
What is the commitment level and timeline for this? I ask because I've already implemented xfail, xpass, and some other feature extensions to node:test in a yet-to-be-released node-test-extra package (analogous to what fs-extra is to node:fs). I'd be delighted if xfail were subsumed into node:test. Because I've put a lot of thought into it, I might have something to contribute to nodejs on this: I could write up my thoughts, collaborate with @JakobJingleheimer, and/or create an alternate PR.
lmk, please
Ah, sure. I'm not particular about the name as long as it's intuitive (which xfail seems to be).
I expect this could land pretty quickly: quite a few collaborators had previously voiced support for it, and to my knowledge there have been no general concerns raised against the feature itself.
Aside from the name (which is trivial to update in the PR), the remaining work is:
- [x] find where the result data gets passed to reporters (and add this property to that—the actual implementation work there is probably ~5 seconds)
- [x] update the docs
Creating competing PRs is never desirable and generally pisses off the person / people already working on it.
Happy to collaborate on the feature; but aside from influencing the name, I think this is very straightforward, and very little implementation work remains.
In terms of timeline to finish this, I expect to have time this week (the past ~2 weeks were very tumultuous for me).
Additional features that my version has:
- An option to specify the expected error. I think this is really important: if, for example, a test documents/reproduces/tracks a bug in released code, you will want it to xfail, since it ought not fail CI, but only if it fails in the expected way.
- A command line option to treat xfails as regular tests, i.e. as test failures. This is useful when you want to see the failure details, which most reporters only show for failed tests. With node-test-extra, it will also show all debug/console logging, which is otherwise suppressed.
- I am considering augmenting that command line option with a filter arg, so you can filter which xfails are treated as regular tests.
I think that's it, though I'll have to take a look at
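A hypothetical sketch of the first item above, expected-error matching (the regex-valued `xfail` shown here illustrates the node-test-extra idea and is not the API proposed in this PR):

```js
const test = require('node:test');
const assert = require('node:assert');

// Hypothetical: the value constrains *how* the test is allowed to fail.
// Failing with any other error would still fail CI.
test('bug repro: off-by-one in released code', { xfail: /strictly equal/ }, () => {
  const buggy = (n) => n + 1; // stands in for the released, buggy code under test
  assert.strictEqual(buggy(1), 1); // throws "Expected values to be strictly equal: ..."
});
```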
Alex and I considered those and pushed them to v2 to get the base implementation landed sooner (many users probably won't need the v2 extras, so no need to make them wait for things they don't want/need). If you already have an implementation for that, I'm happy to defer to you for that once this lands 🙂
Deferring to v2 makes sense, but I think you should quickly consider how it would be added to the API, just to avoid potential API thrashing or getting painted into a corner. In the implementation, you have a string value given for xfail that is treated as a reporting message, just like todo. That makes sense:
test('this should do that', { xfail: 'bug #382' }, (t) => {
//...
});
How would you, in v2, specify the expected failure? An xfail-error property that specifies the error message or pattern? Something that can be parsed out of the existing xfail property? Or within the test function, e.g. via a new assert.xfail method that specifies both the expected value and also the expected erroneous value? Or...?
Arguably a new kind of assert might be better than adding an xfail marker as a test property.
assert.xfail(expected, actual, expectedErroneous?, message?)
OTOH, assertions of failures maybe should be reserved for testing that the code throws the correct exception under specific conditions.
That's contrary to the intention of this feature: the test should be identical to the non-failing version. This is a flag to flip.
assert.xfail is a different feature 🙂
> you have a string value given for `xfail` that is treated as a reporting message, just like `todo`. That makes sense: `test('this should do that', { xfail: 'bug #382' }, (t) => { /* ... */ });`
We were thinking in the case of it.fail, any failure is okay.
In the case of { fail }, it would accept true, a string, or a regex to compare against err.message (and maybe an instance of Error/AssertionError).
For all of these there is no API thrashing: they're all truthy/falsy, which makes them additive to the current API.
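A rough sketch of how such values could be matched against the thrown error (the helper name and exact semantics are assumptions, not part of the PR):

```js
// Hypothetical helper: decide whether a failure counts as the expected failure.
function matchesExpectedFailure(expected, err) {
  if (expected === true) return true;                          // any failure is okay
  if (typeof expected === 'string') return err.message.includes(expected);
  if (expected instanceof RegExp) return expected.test(err.message);
  if (expected instanceof Error) return err.message === expected.message;
  return false;
}
```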
it all sounds good to me. I'm glad this is getting into node:test. Thank you!
As to reporting, I think you have three options:
1. `xfail` is a special case of `pass`, if you want existing reporters that don't know about xfail to report it as a regular pass.
2. `xfail` is a special case of `todo`, if you want existing reporters that don't know about xfail to report it as a todo. This makes conceptual sense, since it is something that shouldn't be treated as "all green" and should be highlighted as needing to be addressed.
3. You don't worry about breaking existing reporters, and report xfail distinctly from both pass and todo.
In my implementation, I chose option 2. A failing xfail generates a test todo event, where the todo message is prefixed with "xfail:". A passing xfail generates a test fail event, with an error indicating that it was an xfail that unexpectedly passed.
This allows my xfail aware reporter to produce a more informative report (it formats xfail results differently from both pass and todo), while degrading gracefully for legacy reporters.
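For illustration, a minimal xfail-aware custom reporter along those lines, written as the async-generator style reporter node:test supports (the `xfail:` prefix convention is the one described above, not something node itself emits):

```js
// xfail-aware-reporter.mjs — treats todo messages prefixed with "xfail:" as
// expected failures; everything else is reported as usual.
export default async function* xfailAwareReporter(source) {
  for await (const event of source) {
    if (event.type === 'test:pass') {
      const { name, todo } = event.data;
      if (typeof todo === 'string' && todo.startsWith('xfail:')) {
        yield `XFAIL ${name} (${todo})\n`;
      } else if (todo) {
        yield `TODO  ${name}\n`;
      } else {
        yield `pass  ${name}\n`;
      }
    } else if (event.type === 'test:fail') {
      yield `fail  ${event.data.name}\n`;
    }
  }
}
```

Run with something like `node --test --test-reporter=./xfail-aware-reporter.mjs`.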
I (and I think Alex too?) was imagining the result of it.xfail would go into the existing pass (when it throws) or fail (when it doesn't throw) and would additionally have a property expectedFail: true (not married to the prop name either—if a well known one exists that isn't crazy, happy to go with that, especially if reports will already support it).
Reporters that support the extra prop do so, and those that don't just behave "normally" (even the uninformed report is still correct).
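Sketch of what a reporter-side check could look like under that design (the `expectedFail` property name and its placement inside `details` are assumptions taken from the comment above, not a settled API):

```js
// Hypothetical: reporters that know about the property can highlight it;
// reporters that don't still see an ordinary pass/fail event.
function describeResult(event) {
  const expectedFail = event.data?.details?.expectedFail === true; // assumed placement
  if (event.type === 'test:pass' && expectedFail) return 'expected failure';
  if (event.type === 'test:fail' && expectedFail) return 'unexpected pass';
  return event.type === 'test:pass' ? 'pass' : 'fail';
}
```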
OK, that's what I meant by option 1. Since I've implemented an xfail-aware custom reporter, it's all good for my purposes.
Huzzah! The TAP reporter is now consuming/surfacing xfail. The others aren't yet because I can't find where they compose their output text.
> The others aren't yet because I can't find where they compose their output text.
Jacob, both spec and dot reporters rely on: https://github.com/nodejs/node/blob/37d9cfcd3ae5d1857838937d3403d16d227c05ba/lib/internal/test_runner/reporter/utils.js#L71
I know this as I spent a lot of time trying to reverse engineer the semantics of TestsStream, which isn't documented nearly enough for people to write custom reporters without inferring semantics by reading all of the built-in reporter code. I will open a discussion or issue on that topic when I get the chance.
But as part of my reverse engineering, I ended up paying off some test_runner tech debt in precisely this part of the code: #59700. It seems to be falling through the cracks in PR limbo. Maybe we can get that merged? It isn't strictly necessary for your PR, but since we'll have review eyes on this area of code, it would be efficient to do both at the same time.
> Jacob, both spec and dot reporters rely on: https://github.com/nodejs/node/blob/37d9cfcd3ae5d1857838937d3403d16d227c05ba/lib/internal/test_runner/reporter/utils.js#L71
Huzzah, yes, that's it. Thanks!
Codecov Report
:x: Patch coverage is 87.50000% with 3 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 88.53%. Comparing base (bd3a202) to head (241e5dd).
:warning: Report is 204 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| lib/internal/test_runner/reporter/utils.js | 33.33% | 1 Missing and 1 partial :warning: |
| lib/internal/test_runner/tests_stream.js | 75.00% | 1 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #60669 +/- ##
==========================================
- Coverage 88.55% 88.53% -0.02%
==========================================
Files 703 703
Lines 208077 208449 +372
Branches 40083 40210 +127
==========================================
+ Hits 184254 184548 +294
- Misses 15841 15905 +64
- Partials 7982 7996 +14
| Files with missing lines | Coverage Δ | |
|---|---|---|
| lib/internal/test_runner/harness.js | 91.82% <100.00%> (ø) | |
| lib/internal/test_runner/reporter/tap.js | 98.23% <100.00%> (+0.01%) | :arrow_up: |
| lib/internal/test_runner/test.js | 97.36% <100.00%> (-0.01%) | :arrow_down: |
| lib/internal/test_runner/tests_stream.js | 89.65% <75.00%> (-0.35%) | :arrow_down: |
| lib/internal/test_runner/reporter/utils.js | 90.38% <33.33%> (-2.76%) | :arrow_down: |
PR is quasi-blocked by a bug in the Validate commit message workflow (pending fix: https://github.com/nodejs/core-validate-commit/pull/130).