Specification icon indicating copy to clipboard operation
Specification copied to clipboard

Interpreting TAP streams lacking version numbers

Open isaacs opened this issue 3 years ago • 13 comments

https://github.com/TestAnything/Specification/pull/25#discussion_r1056428795

Let's pull this into a new issue here :)


eli-schwartz 1 hour ago I implemented this in Meson as a passing test which verbosely warns you not to do this. I got a response:

https://github.com/mesonbuild/meson/pull/11186#issuecomment-1362809587

Meson is able to parse TAP12 and TAP13, therefore it does not have to warn if the version line is missing. IMO the "Harnesses may treat any TAP stream lacking a version as a failed test" is just silly. The point of TAP is exactly that it is easy to produce without relying on a library that handles the versioning and formatting for you, so good luck enforcing that with dozens of TAP producers. Thoughts?

@eli-schwartz eli-schwartz 1 hour ago More importantly:

In fact the test-anything.org page says:

Here’s what a TAP test stream looks like:

1..4 ok 1 - Input file opened not ok 2 - First line of the input valid ok 3 - Read the rest of the file not ok 4 - Summarized correctly # TODO Not written yet Which regardless of anything else seems to be a problem -- the homepage of https://testanything.org/ describes a TAP stream without the soft-mandatory version, so it is valid TAP 12 but a TAP 14 harness can report it as a test failure. Maybe this should include a version...

@bonzini bonzini 1 hour ago Thanks for bringing it up here @eli-schwartz. In my opinion the correct reading of the spec is:

a TAP harness MUST treat a missing version line as version 12

a TAP harness MAY reject versions below any limit they would like to apply (even though it's not a good idea :))

The latter does imply rejecting input without a version line (which puts the limit at version 13), but does not mean that harnesses should warn if the version line is absent.

isaacs avatar Dec 23 '22 17:12 isaacs

This is an interesting question. I wonder what the behaviors are in practice across implementations? That should be our guide, in my opinion.

For node-tap, the version line is completely optional, and the parser will faithfully emit a version event with whatever version it finds, as long as it comes at the start of the TAP stream. (A TAP version [0-9]+ line that appears later on in the stream is emitted as an extra event, the same as any other unrecognized content. It's only an error if in strict mode, either via pragma or by instantiating the parser with { strict: true }.)

But, all streams are interpreted as TAP 14, regardless of reported version. So you can have child tests, pragmas, etc., regardless of the reported version.

I'd say, if other implementations are the same, then that's a bug in the spec, which we should fix. If other implementations are different, then that's a bug in node-tap, which I should fix. (Or at least, a deviation from the spec that should be called out, made opt-in, etc.)

Personally, I prefer node-tap's behavior here. Maybe we could thread the needle somehow by saying that a version greater than a harness's supported version may be an error or warning (though maybe also state that the harness should do its best with it, since TAP is likely to continue to shoot for backwards-compatibility), and that a missing version should be interpreted as TAP 14 (since TAP 14 is backwards compatible to TAP 12 anyway).

What do you all think?

isaacs avatar Dec 23 '22 17:12 isaacs

I'd be okay with defining it as "missing version means 12, and TAP harnesses cannot treat that as an error unless they treat version 12 as an error due to not implementing 12". Which seems unlikely that someone wouldn't implement version 12. :laughing:

It seems to me that that was what TAP13 specified.

eli-schwartz avatar Dec 23 '22 17:12 eli-schwartz

Maybe we could thread the needle somehow by saying that a version greater than a harness's supported version may be an error or warning

FWIW meson currently just swallows it (we only handle version 12/ 13) but the version field is used in accordance with:

A Harness may silently ignore invalid TAP lines, pass them through to its own stderr or stdout, or report them in some other fashion.

to select whether the "report them in some other fashion" tells the user to report it as a feature request to implement (because we don't implement TAP 14 functionality yet), or "probably a bug in the test". In either case, the test passes, it's just noisy.

and that a missing version should be interpreted as TAP 14 (since TAP 14 is backwards compatible to TAP 12 anyway).

So I think you're suggesting e.g. to change:

Harnesses may treat any TAP stream lacking a version as a failed test.

to

Harnesses must not treat any TAP stream lacking a version as a failed test. Instead, it shall be considered TAP 12.

And change:

Harnesses may interpret ostensibly TAP13 streams as TAP14, as this specification is compatible with observed behavior of existing TAP13 consumers and producers.

to:

Harnesses may interpret ostensibly TAP12 or TAP13 streams as TAP14, as this specification is compatible with observed behavior of existing TAP12 and TAP13 consumers and producers.

eli-schwartz avatar Dec 23 '22 19:12 eli-schwartz

So I think you're suggesting e.g. to change:

Yes, I think that would bring node-tap into compliance, and seems sensible to me. Just need two other implementations to agree, assuming we're still following the rule of 3.

That brings up another question: what's the path to make further edits and refinements to the spec, if they introduce some functional change like this? Even if it's to fix a spec bug, do we call that "tap 15"? That seems heavy handed. "14.0.1"?

isaacs avatar Dec 23 '22 21:12 isaacs

I'd be okay with defining it as "missing version means 12, and TAP harnesses cannot treat that as an error unless they treat version 12 as an error due to not implementing 12". Which seems unlikely that someone wouldn't implement version 12. laughing

Yes, this sounds sensible to me too

Leont avatar Dec 23 '22 21:12 Leont

I would say that it technically means the same thing depending on the exact wording, so give it a bugfix release number.

I'll further consider the wording over the weekend and submit a change, I guess. :smile:

eli-schwartz avatar Dec 23 '22 21:12 eli-schwartz

My proposed wording:

To indicate that rest of the output is TAP13 or newer, the first line must be

    TAP version <major version number>

If the first line of the input is not a version line, harnesses MUST behave as if a `TAP version 12` line was present.

Harnesses MAY require a minimal version and reject any input that specifies an older version. Harnesses MUST NOT treat a TAP stream lacking a version line as a failed test, unless they do not support TAP version 12 input.

Harnesses MAY interpret ostensibly [TAP13](https://testanything.org/tap-version-13-specification.html) streams as TAP14, as this specification is compatible with observed behavior of existing TAP13 consumers and producers.

bonzini avatar Dec 24 '22 15:12 bonzini

Submitted an attempted rewording.

Treating a missing version number as implicitly version 12, and then silently handling that as a valid/supported TAP stream, is what Meson used to do before the TAP14 wording, and I am more than happy to revert to that state of affairs ;) so,

Yes, I think that would bring node-tap into compliance, and seems sensible to me. Just need two other implementations to agree, assuming we're still following the rule of 3.

I guess Meson counts as the second?

eli-schwartz avatar Dec 29 '22 02:12 eli-schwartz

I'd be okay with defining it as "missing version means 12, and TAP harnesses cannot treat that as an error unless they treat version 12 as an error due to not implementing 12". Which seems unlikely that someone wouldn't implement version 12.

Sounds good to me too.

kinow avatar Dec 29 '22 09:12 kinow

Having slept on this, the only concern I'd have is that it seems like if you have a stream like this:

1..1
# Subtest: child
    1..1
    ok
ok 1 - child

I want to interpret that as TAP14 (ie, handle the child test as a child test, not as non-TAP lines).

But, if the rule is If the first line of the input is not a version line, harnesses MUST behave as if a `TAP version 12` line then that seems like the child test must not be interpreted as a child test. That seems weird to me?

What about something like this:

To indicate that rest output is TAP13 or newer the first line must be

TAP version <major version number>

If the first line of the input is not a version line, harnesses MUST behave as if a TAP version 12 line was present.

Harnesses MAY require a minimal version and reject any input that specifies an older version. Harnesses MUST NOT treat a TAP stream lacking a version line as a failed test, unless they do not support TAP version 12 input.

Harnesses MAY interpret ostensibly TAP13 or TAP12 streams as TAP14, as this specification is compatible with observed behavior of existing TAP13 consumers and producers.

isaacs avatar Jan 02 '23 22:01 isaacs

Tbh, I think the "TAP version" line is kind of a regrettable wart. I always forget to write it when I'm manually outputting test results, and it seems like its only purpose is to enable adding non-backwards-compatible features to the spec, which we all agreed is kind of a bad idea anyway. 🤷 I'd be fine with removing it entirely in a future TAP version, though I realize I'm probably in the minority on that.

isaacs avatar Jan 02 '23 23:01 isaacs

Yes, I agree that the extra paragraph makes things clearer.

bonzini avatar Jan 03 '23 06:01 bonzini

That paragraph contradicts the TAP Philosophy:

  • Ignore any unparsable lines

Do you want a consumer to rigorously adhere to the TAP spec, or do you want them to follow that directive and ignore any lines that are unparsable?

memreflect avatar Apr 02 '23 18:04 memreflect