ava
Log test duration for serial tests only
I'm thinking of revisiting #971, but with a twist.
Reporting test durations when tests run concurrently can be misleading, since slow, synchronous code in one test also slows down another. This is why I've been hesitant to expose the duration numbers in the TAP reporter.
Oddly, though, we do show durations in the verbose reporter! This has survived all the way from #30.
I wonder if we should only show durations when --serial is used. That way the numbers will be more accurate and users will be less tempted to fix one "slow" test when the real problem lies in another. We can then also include durations in the TAP output.
👍
Hello @novemberborn I am interested in working on this issue.
My question is, how should we format the output?
TAP version 13
# passing
ok 1 - passing 30 ms
1..1
# tests 1
# pass 1
# fail 0
Is this okay?
@okyantoro yay!
I don't know — the spec is at http://testanything.org/tap-version-13-specification.html.
This may require a PR to https://www.npmjs.com/package/supertap which we use to format the TAP output.
@novemberborn that spec doesn't mention time durations, but I think it is reasonable to put the duration information in the test description.
I think it is okay to add a duration option to supertap.test().
We would have something like this:
supertap.test('Test Foo()', { ..., duration: test.duration, ... });
However, if we concatenate the test title with the duration inside AVA, we don't need to modify supertap's behavior at all; we also already have pretty-ms as a dependency to format the duration.
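To make the concatenation idea concrete, here is a hedged sketch. AVA would use pretty-ms for the formatting; formatMs below is a hypothetical stand-in that only covers sub-second values, so the example stays self-contained:

```javascript
// Sketch only: append a formatted duration to the test title before handing
// it to supertap. formatMs is a hypothetical stand-in for pretty-ms.
const formatMs = milliseconds => `${milliseconds}ms`;

// When no duration is available (e.g. concurrent runs), leave the title alone.
const titleWithDuration = (title, duration) =>
	duration === undefined ? title : `${title} ${formatMs(duration)}`;

console.log(titleWithDuration('passing', 30)); // → passing 30ms
```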
It is harder than it seems. Directly passing the title and duration together, like:
TAP version 13
# passing 30ms
ok 1 - passing 30ms
1..1
# tests 1
# pass 1
# fail 0
The above output will not work, at least when we pipe it through tap-spec or tap-summary.
When we format that output with tap-summary, for example $ cat test.txt | npx tap-summary, the result is:
# Tests
✔ unnamed test [pass: 1, fail: 0, duration: 4ms]
# Summary
duration: 4ms
planned: 1
assertions: 1
pass: 1
fail: 0
Or maybe we can put the duration in supertap's comment option?
It should work with tap-spec, tap-summary etc. Perhaps have a look at what other test runners do, e.g. node-tap.
I browsed the TAP spec repo and found https://github.com/TestAnything/Specification/issues/16, which discusses a time directive.
Using a time directive looks nicer to me, but it is not standard, as the spec allows only two directives:
Directive: The test point may include a directive, following a hash on the test line. There are currently two directives allowed: TODO and SKIP. (http://testanything.org/tap-version-13-specification.html)
TAP version 13
ok 1 - passing # time=5ms
1..1
# tests 1
# pass 1
# fail 0
node-tap uses the time directive for subtests:
A test point associated with a Subtest can also have a # time=... directive indicating how long the subtest took to run. http://www.node-tap.org/tap-format/
But I think AVA's test reporting doesn't work this way.
Another option: we put the duration inside a YAML diagnostic block.
TAP version 13
ok 1 - passing
  ---
  time: 3ms
  ...
1..1
# tests 1
# pass 1
# fail 0
I'm tempted to defer to node-tap, though I must admit I don't fully understand how "subtests" match AVA's behavior. Perhaps it's fine.
Can you find out how the typical Node.js TAP parsers handle duration?
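For reference, a consumer could recover such a directive with a simple pattern. This is a hedged sketch, not how any particular parser (tap-parser, tap-spec, tap-summary) actually works:

```javascript
// Sketch of extracting a non-standard "# time=..." directive from a TAP test
// point line, normalizing the value to milliseconds. Returns null when no
// time directive is present.
const TIME_DIRECTIVE = /#\s*time=([\d.]+)(ms|s)?\s*$/;

const parseTime = line => {
	const match = TIME_DIRECTIVE.exec(line);
	if (!match) {
		return null;
	}

	const value = Number(match[1]);
	return match[2] === 's' ? value * 1000 : value;
};

console.log(parseTime('ok 1 - some async stuff # time=9.647ms')); // → 9.647
```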
@novemberborn
As far as I know, node-tap treats each assertion as one test. When certain assertions need to be grouped together, they are treated as a subtest. AVA's assertions are not treated as tests the way node-tap's are. Also, in AVA we cannot create a test() inside another test(), so AVA doesn't have subtests. Correct me if I am wrong.
I haven't checked node-tap's source code yet, but I have read an example at http://www.node-tap.org/basics/.
The example shows a subtest that contains no tests of its own. Here it is:
TAP version 13
# Subtest: some async stuff
    ok 1 - should match pattern provided
    1..1
ok 1 - some async stuff # time=9.647ms
# Subtest: this waits until after
    1..0
ok 2 - this waits until after # time=6ms
1..2
# time=36.53ms
This example is taken from http://www.node-tap.org/basics/
Then for ava, I think we can do something like:
TAP version 13
# Subtest: test foo
    1..0
ok 1 - test foo # time=3ms
1..1
Basically, test foo contains no subtests (a pseudo subtest 😄). At least it matches the format described at http://www.node-tap.org/tap-format/ 😄 but, as I mentioned in https://github.com/avajs/ava/issues/1668#issuecomment-364721169, the Test Anything Protocol specification doesn't mention a time directive. Any thoughts?
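As a hedged sketch of how such a pseudo-subtest block could be emitted for a single test (the function name and signature are made up for illustration, not AVA's or supertap's API):

```javascript
// Sketch: emit the "pseudo subtest" lines for one test. The subtest body is
// indented four spaces and contains only an empty plan (1..0), followed by
// the test point carrying the time directive.
const pseudoSubtest = (index, title, durationMs) => [
	`# Subtest: ${title}`,
	'    1..0',
	`ok ${index} - ${title} # time=${durationMs}ms`
].join('\n');

console.log(pseudoSubtest(1, 'test foo', 3));
```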
We can follow node-tap's lead here. We do have #324 so we can show individual assertions in the TAP output.
@okyantoro could I ask you to hold off with this for now? I've started refactoring how the test workers communicate with the main process and how all that communication flows to the reporters. This new architecture should let us fix most open reporter issues and add new functionality, but it's going to conflict with this issue.
I'll try and focus the refactor so it doesn't change functionality, so you can still work on this issue once it lands.
sure @novemberborn
I'd like to tackle https://github.com/avajs/ava/pull/1776 before changing this behavior.
@novemberborn Should I tackle this issue, on behalf of #2217?
If you're meaning to do it first, sure. Or else perhaps as part of that. I'll assign you anyway.
We are optimizing our AVA tests and it would be nice to quickly determine the slowest-running tests (or test files) from AVA's point of view (when using serial). Are there any updates on this feature?
@shusson no. It may be more feasible to implement this now that we have combined the codebase for the "mini" and verbose reporters.
That said, you could run with --serial --verbose and you'd see relatively slow tests; IIRC those that take more than 100ms. There is no easy way to filter on that, though (which is out of scope for this issue anyway).
This could be useful for https://github.com/avajs/ava/issues/2965 too, as my use case is to measure test file durations (not individual tests), so I can load balance the files across multiple machines.
My current plan is to glob the test files and then time ava --concurrency 1 <each-file>. Having built-in support would make this much easier to implement.
If you think it's a good idea to measure duration for test files I can open a separate issue for it.
@erezrokah that seems like quite a specific use case.