hardhat chore: implement additional test cases for the node test reporter

[ ] Because this PR includes a bug fix, relevant tests have been included.
[ ] Because this PR includes a new feature, the change was previously discussed on an Issue or with someone from the team.
[x] I didn't do anything of this.

This PR is part of the initiative to enhance our custom reporter - https://github.com/NomicFoundation/hardhat/issues/5437

This PR modifies the reporter's test runner in the following ways:

adds a --test-only flag (e.g. running pnpm test:integration --test-only example-project would only run the tests for the example-project fixture)
adds flag parsing (only --show-output, --test-only, --color, --no-color are accepted)
ensures the tests are executed relatively to the reporter's root (this is relevant for some traces)
ensures all ms timing signatures are replaced with Xs
ensures all node line:column references are replaces with X:Xs

This PR also adds the following test cases:

hooks (i.e. before, beforeEach, after, afterEach, describe)
- throw an error from within a hook
multiple files
- import a function which fails when executed inside a test case
- import a function which passes when executed inside a test case
nested describes (3 levels)
- fail a test case
- pass a test case
- throw an error with cause inside a test case
test.only
fail a test case
slow test
- pass a slow test case
- fail a slow test case
test.todo
- fail a test case
top level test (without a describe)
- fail a test case
- pass a test case
- throw an error with cause inside a test case

Follow-up Questions

Should we continue to include colour codes in the expected/actual outputs?
- My thinking is that diffs are easier to parse for humans without them, but it comes at a price of not checking the right colouring.
Should we expect test.only tests to be executed by default?
- Executing node --import tsx/esm --test --test-reporter=./dist/src/reporter.js only-test/*.ts --color produces https://github.com/NomicFoundation/hardhat/blob/4cee3401010b74189afd878e05f4e07b6263c458/v-next/hardhat-node-test-reporter/integration-tests/fixture-tests/only-test/result.txt with the test case inside the test.only block executed.
Do we want any special markings for TODO tests?
- Currently, the TODO indication is only present in the summary, see https://github.com/NomicFoundation/hardhat/blob/4cee3401010b74189afd878e05f4e07b6263c458/v-next/hardhat-node-test-reporter/integration-tests/fixture-tests/todo-test/result.txt
Should we have any inline indication if a test throws in the after (3))/describe (11)) block?
- Right now, it is only reported together with the other errors at the end, but there is no indication, which test it relates to, see https://github.com/NomicFoundation/hardhat/blob/4cee3401010b74189afd878e05f4e07b6263c458/v-next/hardhat-node-test-reporter/integration-tests/fixture-tests/hooks-test/result.txt
- These throws also meddle with the summary. In after, we end up with 2 added to the failing count (expected) and an additional 1 added to the failing count to account for the after block failing. In case of describe, it's 2 added to the cancelled as expected and additional 1 added to the failing count.

Aug 23 '24 14:08 galargh

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
hardhat	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Aug 28, 2024 1:28pm

Aug 23 '24 14:08 vercel[bot]

⚠️ No Changeset found

Latest commit: 9fcd5314efad1ea00d4ffbec168917a96e9fce0c

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Aug 23 '24 14:08 changeset-bot[bot]

I scheduled sync time with @alcuadrado this week to discuss the follow-up questions that came to me during the course of preparation of this PR. I don't believe any of them should be blocking for this PR through.

Aug 26 '24 14:08 galargh

Should we continue to include colour codes in the expected/actual outputs?

My thinking is that diffs are easier to parse for humans without them, but it comes at a price of not checking the right colouring.

I think, ideally, we include them, but also in some rendered format. I've tried a few solutions and this seems to be the best option:

sudo apt install aha wkhtmltoimage
cat result.txt | aha --black  > result.html
wkhtmltoimage --format svg result.html result.svg

There are some node.js solutions, but they pull puppeteer, which don't seem to work on devcontainer on apple silicon.

Aug 26 '24 22:08 alcuadrado

Great work, @galargh!

I have some comments/questions related the tests and the behavior that we want to achieve in the reporter:

hooks-tests: Can we report that a failure was in an afterEach hook?
hooks-tests: I think the after and before errors look well.
hooks-tests: The tests cancelled due to a before error could probably be improved, now that we know that after, afterEach, and beforeEach don't cancel the test.
hooks-tests: Same for the ones cancelled due to a describe error.
hooks-tests: Can we distinguish what cancelled each test?
multi-test: can we avoid showing the empty file paths? This behavior is also present when you use --test-only.
nested-test: we are missing the case of tests nested inside tests, which IMO is a very unfortunate feature. We should test that path though. Note that a nested it whithin an it needs to be awaited.
only-test: we should probably add another file to that test, with a an ignored describe and it.
onlyt-test: we should probably test it with and without --test-only, so that we can see the two behaviors.
todo-test: we can also add a it.todo withot a callback, as that's the actual todo behavior. It's unexpected that node:test runs the test if you do provide a callback, so our current test is asserting a failure.

Follow-up Questions

See my PR into this PR, I think it's a good trade-off.
Unfortunately, node:test imposes that behavior. I think it's to avoid the typical situation where you push a .only accidentally and your CI magically passes in 10ms. I disagree with that product decision, but it's out of our control.

See the note about the test, though.

Also, we may need a way to customize the message that tells the user to run with --test-only, as Hardhat's CLI is different. Ideally we just pass it as an argument.

I think the test is not treated as a "todo test" because it has a callback. If you remove it, it is shown differently. We should probably test both cases.
Great points. I also covered them in my notes.

Aug 27 '24 20:08 alcuadrado

As we discussed in our call, let's merge this and implement any improvements to the reporter in follow-up PRs.

re the missing test cases: if you want to add them here that's also fine, but may a follow up PR would be easier to review.

That makes sense, thank you for the approve. I'm going to merge it as-is and propose the other work items we discussed in follow-up PRs.

Aug 29 '24 10:08 galargh