Specification Test2 subtests not supported?

https://github.com/Test-More/test-more/issues/942

Pretty sure in our discussions we decided the Test2 format would be supported formally by TAP14. I have not verified the claims in this ticket yet.

Sep 12 '22 03:09 exodist

problems with Test2 format:

it appends character to reason so to make it stream-parsable spec should prohibit tailing '{' in reason / directive description
- for example:
  - which one is subtest? ok 1 or ok 3 ?

ok 1 - {
    ok 2
    1..1
ok 3

where should diagnostic message come?
- after result (ie as first lines after subtest start? same indent or subtest indent?)
- or after closing } ?
subtest end (unlike TAP14 # Subtest: declaration) is invalid line accorting TAP14

From parsing point of few (for automated processing of test results), Test2 subtests removes one advantage over junit - stream parsing. When you removes that, junit becomes better choice.

Sep 14 '22 07:09 happy-barney

I agree with @happy-barney here. The way subtests were originally designed (er, hacked into the mess that was the old test harness) is that test output would be line-based and the final summary "ok/not ok" of the subtest would simply be after the test:

ok 1 - Starting engines
    ok 1 - engaging hyperdrive
    ok 2 - setting course
    not ok 3 - Exceeding lightspeed
not ok 2 - Sci-fi trope

The semantics are clear and we don't have starting and ending braces forming "blocks" to parse. The subtest summary output was always after the subtest ran. A parser can immediately take action on a not ok line (such as halting the test suite, if needed). Further, because the TAP parser was supposed to simply ignore any lines that were not valid TAP, missing the leading or trailing curly brace would not be an issue, but now I'm less sure of that.

Is there a benefit for this { subtest } syntax that I'm missing?

Sep 15 '22 06:09 Ovid

The idea behind the braced/block format is to indicate a "buffered" subtest. Buffered subtests were added so that subtests could run concurrently without confusing a parser. For example 2 subtests running concurrently without buffering can result in this:

ok 1 - blah
    ok 1 - subtest a assertion a
    ok 1 - subtest b assertion a
    ok 2 - subtest b assertion b
    ok 2 - subtest a assertion b
ok 2 - subtest b
ok 3 - subtest a

There are now several tools on cpan that use Test2::AsyncSubtest, tools like Test::Class::Moose, Test2::Tools::Subtes, Test2::Workflow, etc. They rely on the buffered subtests because they allow subtets level concurrency, which is not really possible using the old style subtests that stream subtest events immediately.

For these tools only 1 subtest can print at a given time, and it is printed all at once with the braces to contain it.

Sep 15 '22 13:09 exodist

@exodist buffered subtest is an implementation ... does it matter when you read test output?

Sep 15 '22 13:09 happy-barney

@happy-barney I am not sure I understand the question. When you read it does not matter, but as my example shows the order of the output DOES matter. If you have 2 concurrent subtests and they both output simultaneously then there is no way for a harness to deduce which subtest each indented line belongs to.

In any case, I feel like the ship has sailed here. This type of buffered subtest has been around for years, and is used many places, changing the format now would break several cpan modules, but it would also break several companies that I know are using it.

I mean if you want I am fine adding documentation to some modules that states things like "There is an argument against this a type of subtest and its output, here is the ticket ...". But I am not going to break a ton of people by removing it. I also find it much cleaner and easier to parse in Test2::Harness.

I only have a little bit of energy to engage in this. If nobody likes this and wants to keep it out of the TAP standard then fine, but I will keep the implementation and at most warn in docs that anyone who requires pure TAP can use something else. (Note Test::More does not use this, only using Test2::AsyncSubtest and Test2::Subtest does, so modules currently in perl CORE are not effected)

Sep 15 '22 14:09 exodist

@exodist I have nothing against with buffering, that is good

I just don't see any reason to say "this is buffered subtest output" (from parsing TAP later point of view). buffered subtest can still output

# Subtest: subtest message
    ok 1
    ok 2
    1..2
ok 1 subtest message

Sep 15 '22 14:09 happy-barney

Ah, brief history:

TAP was designed to be both human and machine readable, I feel that as a result it did both poorly
Old school perl programmers get used to it, but teaching people to use/run/write tests has always resulted in confusion around subtests because it feels backwords
Adding a comment to note when a subtest starts feels like a hack
When I took ownership of Test::More 10ish years ago I had a lot of people asking me to "fix" subtests and what they look like
One such time was a request to "outdent the subtest comment" so that the comment lined up with the subtest events, which spawned a several year development process to write Test2 to avoid the problem that changing something as simple as a comment indentation would not break all of cpan in the future (Everyone did string comparison of TAP output to verify their custom test tools)

So, when I started writing Test2 with the primary goal of "Fix the pain points" subtest format was one of the issues high on the list of things that cause people pain, pain that leads to complaints coming to me.

At this point TAP is just the default/fallback used by Test2, which supports formatter plugins. If you use Test2 with Test2::Harness then TAP may never even enter into the pipeline, Test2::Harness uses a machine readable format (JSONL) to send data around internally, and between the test and the harness, then a separate human-friendly output format to the terminal.

But to summarize the rant: I got too many complaints about the old subtest format, and when training people in testing the subtest format seemed difficult for newbies.

Sep 15 '22 14:09 exodist

IIRC (will try to dig up a link a bit later), the decision was to keep indented/commented style subtests, since there was already support in multiple implementations, and leave buffered subtests for later (the initial draft that I wrote included buffered subtests, but they raised some questions).

I think it'd be good to bring them back in a TAP 15. They're nice and imo much easier to read, since you get the bottom line up front. Node-tap has supported them for years. We just need to come to alignment on these questions (non exhaustive list):

Where do diagnostics for the summary test point go?
What happens when the trailing } is missing? (Or perhaps, is a test point ending in { that isn't a leading point for a buffered subtest just an error?)

I believe that we said "3 major implementations supporting" is the line for quorum on additions, so another blocker would be someone else implementing it (and perhaps, Test2 and node-tap meeting on any potential discrepancies they may have, but I think we're pretty close, since I modeled node-tap's support after Test2.)

Nov 12 '22 22:11 isaacs

additional question: does anyone need to distinguish whether subtest was buffered or not ?

Nov 13 '22 00:11 happy-barney

additional question: does anyone need to distinguish whether subtest was buffered or not ?

I haven't found this particularly valuable in node-tap, tbh. I like the buffered Test2 style better when reading the tap itself, but I changed node-tap to always emit comment-prefixed "standard" subtests (even when they are actually async and buffered), and it's not really any more work. You still have to buffer up the data and then emit it all at once with a wrapper, it doesn't make much difference if the wrapper puts the summary at the top or the bottom. I haven't encountered any cases where it makes a difference for the harness to know whether subtests were run in sequence or in parallel.

Nov 14 '22 17:11 isaacs

it makes difference when you are using non-console tools to process TAP Test2 can generate two plan rows in row which will make tools not supporting not-published-yet (and not-specified in output) TAP version to fail

Aug 13 '23 12:08 happy-barney

Test2 can generate two plan rows in row

What does this mean? Can you give an example?

Aug 13 '23 17:08 isaacs

# Subtest: foo
ok 1 - foo {
   ok 1
   1..1
}
1..1

Aug 13 '23 17:08 happy-barney

@happy-barney Right, but this issue is specifically about the braced "buffered" subtest style vs the indented/commented subtest style in TAP 14. How is that any different from what's in the spec?

# Subtest: foo
    ok 1
    1..1
ok 1 - foo
1..1

Aug 13 '23 18:08 isaacs

your example is ok, that braced subtest is not ... my original complain was that Test2 suite uses this braced, which is not supported by 3rd party TAP parsers. If they want braces, they should add them for example as

# Subtest: foo {
    ok 1
    1..1
ok 1 - foo }
1..1

Aug 13 '23 18:08 happy-barney

I am not changing the buffered subtest brace style in Test2. I chose them for human readability/aesthetics. If people do not like the brace style for whatever reason, they can choose not to use them. I am not going to break what is now years of downstream things that depend on them being this format. The most I might do is give people an option (opt-in) that allows them to render buffered subtests without the braces using the old style, comment-indented-result.

At this point Test2 uses TAP as a default only for historical reasons. If someone uses yath, the default output/intermediary is a jsonl stream. TAP is an incredibly lossy format, it is intended to be good for both humans and machines, and in my option ends up terrible for both. Having written a handful of TAP parsers now I have experience to back up that it is terrible for machines, and lossy. The human side is more subjective, so meh, not interested in arguing that point.

JUNIT is also pretty awful, but is a much more widely supported format.

Things are moving away from TAP, and I support this because TAP is awful. Test2's buffered subtests have been around for the better part of a decade, and at this point TAP can either support it or not. Neither choice will make the world end or effect anyone significantly. Test2 is not going to make a radical change to support a third party standards body where 1 or 2 people dislike how it looks.

Aug 14 '23 15:08 exodist

Also, node-tap is "third party" and supports the braced Test2 style. It's not that hard. I don't think test2 has to change, and making that style officially part of a TAP spec only needs a formal description and a third popular implementation.

Aug 15 '23 01:08 isaacs

In the braced variant that Test2 generates, where do SKIP/TODO directives end up?

After the opening brace:

not ok foo { # TODO feature xyz not yet merged
  ok x
  not ok y
  ok z
}

or inside:

not ok foo # TODO feature xyz not yet merged {
  ok x
  not ok y
  ok z
}

Apr 12 '24 09:04 renormalist

ok 1 - foo { # TODO This is todo
    ok 1
    1..1
}
1..1

Apr 25 '24 08:04 exodist

Specification Specification copied to clipboard

Test2 subtests not supported?

Specification
Specification copied to clipboard