JSON-Schema-Test-Suite
Impl test report
Based on a conversation with @karenetheridge in Slack, this is the first step toward building an implementation comparison/support site.
The scripts in this PR will run a given implementation against the entire test suite and generate a report. This report can be used to generate site content.
Still have some work to do:
- [ ] automatically run on commit to master (GitHub Actions, probably)
- [x] run optional tests
Woohoo! This will be awesome. Will have a look tomorrow or so, but thanks, this is a big deal.
What do you think about my idea of passing the data and schemas to the implementation directly via STDIN? This may make it easier for you to generate randomized tests on the fly (and prevent the consuming implementation from potentially cheating, which it can do if it is allowed to see the entire file that the data comes from and therefore see what the correct valid/invalid response should be).
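To make the idea concrete, an exchange over STDIN could look something like the sketch below. The `validator` command name, the `{"schema": ..., "data": ...}` envelope, and the `valid`/`invalid` output are all invented for the example; nothing here is specified by this PR.

```python
# Sketch of driving a hypothetical validator CLI over STDIN.
import json
import subprocess


def validate_via_stdin(schema, instance):
    # Send only the schema and the instance; the CLI never sees the
    # surrounding test metadata or the expected outcome.
    envelope = json.dumps({"schema": schema, "data": instance})
    proc = subprocess.run(
        ["validator"],  # hypothetical implementation CLI
        input=envelope,
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumed convention: the CLI prints "valid" or "invalid".
    return proc.stdout.strip() == "valid"


# e.g. validate_via_stdin({"type": "integer"}, 12) -> True, if the CLI agrees
```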
> What do you think about my idea of passing the data and schemas to the implementation directly via STDIN?
Your CLI can support whatever, but it MUST support what I've laid out. If you want to make it multipurpose so that you could use it outside of this test suite, that's fine.
The idea I'm presenting here is that we want to exercise the implementation against the tests in this repo. If a dev wants to make their CLI do more, then that's up to them. Such extra functionality won't be exercised here.
Regarding cheating, the CLI isn't given the metadata of the tests. It's only given the schema and the data, and from that it has to provide an outcome. The script puts the test metadata (including the expected result) and the actual result into the result file.
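To make that flow concrete, here is a rough sketch of what the runner does per test file. The `--schema`/`--data` pointer arguments, the `valid` output convention, and the report shape are assumptions for illustration; the actual scripts in this PR may differ in detail.

```python
# Sketch of the runner: feed each schema/data pair to the CLI, then pair the
# CLI's answer with the expected result in a report the CLI never sees.
import json
import subprocess
from pathlib import Path


def run_file(test_file: Path, cli: str, report_path: Path) -> None:
    cases = json.loads(test_file.read_text())
    results = []
    for case_index, case in enumerate(cases):
        for test_index, test in enumerate(case["tests"]):
            # The CLI is only given pointers to the schema and the data,
            # never the "valid" flag or the descriptions.
            proc = subprocess.run(
                [
                    cli,
                    "--schema", f"{test_file}#/{case_index}/schema",
                    "--data", f"{test_file}#/{case_index}/tests/{test_index}/data",
                ],
                capture_output=True,
                text=True,
            )
            actual = proc.stdout.strip() == "valid"  # assumed output convention
            # The runner, not the CLI, records expected vs. actual.
            results.append({
                "case": case["description"],
                "test": test["description"],
                "expected": test["valid"],
                "actual": actual,
                "passed": actual == test["valid"],
            })
    report_path.write_text(json.dumps(results, indent=2))
```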
> If you want to make it multipurpose so that you could use it outside of this test suite, that's fine.
That's not what I'm saying. I'm suggesting that if you add to the requirements that the implementation accepts data on STDIN, it would make it easier to supply a wider variety of test data, and on an ad hoc basis.
> It'll only submit schemas that exist in the test suite.
It shouldn't limit itself to that.
> Regarding cheating, the CLI isn't given the metadata of the tests
It is, if you provide a filename and a reference point within it that obviously corresponds to the standard layout of the files in the test suite. If this is what you require, I can provide you with a tool that does nothing but look for a 'valid' property adjacent to 'data' and provide that response.
That's why I think only providing data via files is a mistake -- it restricts too strongly what structure can be used for testing, which means the tool's use is limited.
> it would make it easier to supply a wider variety of test data, and on an ad hoc basis.
For the purposes of this runner, that's not necessary. This runner will only ever run the tests present in the suite.
If your intent is just to see how implementations might behave with a new scenario before you add it to the suite, that could be done locally easily enough.
> It shouldn't limit itself to that.
Why not? We only have the tests that are present in these files.
> If this is what you require, I can provide you with a tool that does nothing but look for a 'valid' property adjacent to 'data' and provide that response.
I think requiring this tells the implementors that we don't trust them. It's not really the signal we want to send. I see no real problem with pointing the CLI at the test case within the files.
Truthfully, an enterprising cheater could hard-code all the test cases into the tool with their expected result, which would still give a 100%.
I'm not really worried about cheating. If we're concerned, we can ask to see the source for the tool.
Additionally, this is a first draft at a runner. It doesn't need to be perfect.
Personally, I don't think we need to spend any effort on making sure an implementation isn't gaming the runner; such a thing is likely to be quickly discovered (by the implementation's users) and the consequences will be bad. It's just high risk, no reward to me.
> Additionally, this is a first draft at a runner. It doesn't need to be perfect.
Of course I agree with this, but another thing that might be useful for a second pass, which I've already been thinking about, is a JSON-based file format which represents skips of tests in the suite. I.e. it should be possible to state in JSON what tests from the suite you want to run (across required and optional) and what you want to flag as skipped (or known failing). Today I do this in my implementation in Python code, but even on my own I will eventually move to JSON, because it's easier to work with -- standardizing such a thing may make things easier for this runner (even if we don't formally standardize it, and just invent something for this repo).
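As a straw man, the skip file and its use could look something like the sketch below. The file names, descriptions, and keys are purely illustrative; nothing like this is standardized anywhere yet.

```python
# Sketch of a JSON skip/known-failure list and a helper that applies it.
import json

# What a hypothetical skips.json might contain.
EXAMPLE_SKIPS = {
    "optional/format/idn-email.json": {
        "reason": "no IDN support in the underlying library",
        "skip": "*",  # skip every case in the file
    },
    "refRemote.json": {
        "reason": "remote refs not wired up in the harness yet",
        "skip": ["base URI change - change folder in subschema"],
    },
}


def is_skipped(skips: dict, file_name: str, case_description: str) -> bool:
    """Return True if the given case should be skipped or flagged as known failing."""
    entry = skips.get(file_name)
    if entry is None:
        return False
    return entry["skip"] == "*" or case_description in entry["skip"]


if __name__ == "__main__":
    print(json.dumps(EXAMPLE_SKIPS, indent=2))
    print(is_skipped(EXAMPLE_SKIPS, "refRemote.json",
                     "base URI change - change folder in subschema"))
```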
> Additionally, this is a first draft at a runner. It doesn't need to be perfect.
Yup, absolutely! I do think it would be a useful feature to have though, and it might be easier if we allow for that from the start rather than updating Docker containers multiple times.
> If your intent is just to see how implementations might behave with a new scenario before you add it to the suite, that could be done locally easily enough.
Yes, being able to generate tests programmatically would be very useful. The test data could still be written out to a file as an intermediary, but that seems like an unnecessary step when we could just stream them.
Even if the data comes from a file, we should not presume that the structure is identical to the existing test suite -- and providing examples that give JSON Pointers that exactly match the test suite structure does imply that. How about the example saying something like `--schema test.json#/0/schema --data test.json#/0/data` instead?
> Truthfully, an enterprising cheater could hard-code all the test cases into the tool with their expected result, which would still give a 100%.
Yes, that's one reason why generated tests are useful.
> I'm not really worried about cheating. If we're concerned, we can ask to see the source for the tool.
It's not about cheating specifically (although we have talked about this in the past, which is when we first started talking about randomly generating tests and feeding them to implementations blind, last year), but also about generating a wider variety of tests that we may not want to commit to the test suite directly. If you recall, I submitted a more comprehensive set of tests last year (PR #385) and they were rejected because of the size, with the suggestion that we could programmatically generate these tests and send them directly to implementations when such a capability was available. I'd still like to be able to do that.
> it should be possible to state in JSON what tests from the suite you want to run (across required and optional) and what you want to flag as skipped (or known failing)
I would suggest we devise some kind of indexing system. It could help with the report as well. The index for a given test would need to remain consistent as the suite changes.
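Purely as an illustration of what such an index could be, one option is to derive it from the draft, file, and descriptions rather than from positions, so inserting new tests doesn't shift existing IDs. Nothing like this exists in the suite today.

```python
# Sketch: a stable-ish test identifier derived from content, not position.
import hashlib


def test_id(draft: str, file_name: str,
            case_description: str, test_description: str) -> str:
    key = "\x1f".join([draft, file_name, case_description, test_description])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


# Example (the descriptions here are only illustrative):
print(test_id("draft2020-12", "type.json",
              "integer type matches integers", "a float is not an integer"))
```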
> providing examples that give JSON Pointers that exactly match the test suite structure does imply that
The pointers submitted to the CLI don't need to match the test files. The runner happens to be reading from the suite files, so in this case they do, but the pointer could point to any location.
> How about the example saying something like `--schema test.json#/0/schema --data test.json#/0/data` instead?
There's nothing preventing this from working. Any file and any location within it is fine.
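For illustration, resolving a `file.json#/json/pointer` argument is only a few lines and has nothing suite-specific in it; the helper below is just a sketch, not part of the PR.

```python
# Sketch: resolve "some/file.json#/a/json/pointer" to the referenced value.
import json
from urllib.parse import unquote


def resolve(reference: str):
    path, _, pointer = reference.partition("#")
    with open(path) as f:
        node = json.load(f)
    tokens = pointer.lstrip("/").split("/") if pointer.strip("/") else []
    for token in tokens:
        # Undo percent-encoding and the RFC 6901 escapes.
        token = unquote(token).replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


# e.g. resolve("test.json#/0/schema") or resolve("anything.json#/foo/3/bar")
```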
> the suggestion that we could programmatically generate these tests and send them directly to implementations
We're not doing this right now. I want to focus on what we're doing. When we figure out how to generate tests, we can have people update their CLI tools. This is the more iterative approach and will get us running something faster.
I'll give writing a hook for my own implementation a shot this weekend, I think; I may have some feedback on the implementer-facing instructions after doing so (though they seem good already).
@gregsdennis are you OK with me closing this as superseded by Bowtie (which does this now at least as much as the PR did)? (Thanks, you def get a ton of the credit for finally pushing for this to happen...)
Yep. Bowtie looks great.