Test suite for HTTP Gateways
Current state
- Specifications for HTTP Gateways exist now at ipfs/specs/http-gateways
- For historical reasons, comprehensive, end-to-end HTTP Gateway tests live in the Kubo IPFS implementation (*gateway*.sh in kubo/test/sharness)
- We have only four simple tests at https://ipfs.github.io/public-gateway-checker/, and they are not robust enough.
Target audience
- Gateway operators wanting to ensure their reverse proxy/CDN setup is correct
- Gateway implementers ensuring they did not miss any nuance or edge case around ipfs/specs/http-gateways
- IPFS users deciding which gateway is useful for their needs
Needs
- Ability to run tests against an HTTP endpoint and get a compliance report against ipfs/specs/http-gateways
- Ability to test each gateway type separately (e.g., a Block/CAR gateway without anything else)
- Runs via CLI and on CI
- Something Kubo, Iroh, and other gateway implementations could run on their CI
- Human-readable errors with references to the relevant section of ipfs/specs/http-gateways
- Nice to haves:
  - Generate HTML reports, so people can publish previews on PRs
  - Generate JSON, allowing augmentation of https://ipfs.github.io/public-gateway-checker/ (pre-generate tests on CI for things that can't be tested live via JS); see the sketch after this list
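A purely illustrative sketch of the kind of record each check could emit as JSON, covering both the machine-readable report and the human-readable, spec-referencing error; every field name below is made up, nothing is specified yet:

```js
// Hypothetical shape of a single compliance-report entry.
const reportEntry = {
  id: 'raw-block/response-headers',
  status: 'fail', // 'pass' | 'fail' | 'skip'
  // Human-readable error…
  message: 'Content-Type should be application/vnd.ipld.raw',
  // …plus a pointer to the relevant section of ipfs/specs/http-gateways
  // (the anchor here is invented for illustration).
  spec: 'ipfs/specs/http-gateways#response-headers'
}

console.log(JSON.stringify(reportEntry, null, 2))
```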
@lidel I think we can decouple the functionality from public-gateway-checker and use the checker as a UI on top of this test-suite/validation functionality.
We can distribute the validation checks as an ESM module which can then be used anywhere, in-browser or on Node, to generate appropriate compliance reports. CLI functionality would be straightforward using npx.
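For illustration only, a minimal sketch of what such an ESM surface could look like; the module name, function signature, and the single check shown are all hypothetical, not an existing package:

```js
// check.mjs — hypothetical entry point.
// The same module works in Node and the browser; a thin CLI wrapper (run via npx)
// would just call checkGateway() and format the results.
export async function checkGateway (gatewayUrl, { fetch = globalThis.fetch } = {}) {
  const results = []

  // One example check: fetching a raw block (a real suite would iterate over many).
  // The CID here is just an example; a real suite would ship a fixture list.
  const res = await fetch(new URL('/ipfs/bafkqadtimvwgy3zanfxgy2lomvsau?format=raw', gatewayUrl))
  results.push({
    id: 'raw-block/format-param',
    pass: res.ok && res.headers.get('content-type') === 'application/vnd.ipld.raw',
    spec: 'ipfs/specs/http-gateways'
  })

  return results
}
```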
The caveat is that there are HTTP tests which can't be done via JS running on a webpage (due to things like access to HTTP headers, redirects, CORS, etc.). We will never have a truly isomorphic test suite. The checker would consume this, but we should not design it around that use case; the target audience is different.
@darobin may have some ideas on how to get the biggest bang for the buck here, and how to avoid maintenance headaches (this thing will live for decades, which requires somewhat different planning).
I would also worry that the test suite could become rather large as more corner cases get covered, which might not align well with what is good for public-gateway-checker (I'm not sure it's supposed to be fully comprehensive).
What I have in mind is (roughly, with ideas taken from WPT):
- The primary use case is CI. This isn't to say that other uses are precluded, just that the goal is interop and what's in CI gets fixed. This provides an incentive for users who find an interop issue to contribute a test case.
- The test runner gets passed the location of the gateway (it could also start the daemon itself; WPT does that with WebDriver, we can look into that).
- Reporting in various formats is easy. I'm not sure what you have in mind with "previews on PRs" @lidel? It would also be easy to report in MD, meaning the output could then just be pasted into the PR description if that's useful.
- This should "just" be a subset of the full test suite, but running just a subset (or even a single test) ought to be easy.
- "Human readable errors" likely depends more on the tests than on the test runner. However, I was considering the idea that we could have the tests and the specs in the same repo (basically the interop monorepo). So the tests in
tests/ipfs/gateways/*would be for the spec inspecs/ipfs/gateways.md. That means the tests could easily reference notions defined in the specs (because the metadata for that is automatically exported when the spec is generated) so you could eg. have a test string statingcheck that the :ref[OPTIONS method] lists all verbsand have the report link to the right bit of the spec. - Conversely, running the test with a
--saveflag would output results for that implementation to the same repo, which in turn the specs could use to showcase how well supported they are, à la caniuse.com.
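To make the spec-linking idea concrete, a hedged sketch of what one such test entry could look like; the file layout, the `spec` field, and the exact OPTIONS assertion are assumptions for illustration, not anything the spec mandates today:

```js
// tests/ipfs/gateways/options.js — hypothetical layout and metadata.
export default [
  {
    name: 'check that the OPTIONS method lists all verbs',
    // Resolved against the generated spec so the report can deep-link into it.
    spec: 'specs/ipfs/gateways.md#http-methods',
    async run ({ gateway, fetch }) {
      const res = await fetch(new URL('/ipfs/', gateway), { method: 'OPTIONS' })
      const allow = (res.headers.get('allow') || '').toUpperCase()
      for (const verb of ['GET', 'HEAD', 'OPTIONS']) {
        if (!allow.includes(verb)) {
          throw new Error(`Allow header is missing ${verb}`)
        }
      }
    }
  }
]
```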
Anyway, these are just a few thoughts, I'm not married to anything — this needs to work for implementers, not me! As I was saying in Slack, I'm happy to bang something together so folks can see if they like/hate it.
Ok, I made a prototype — let me know if you think this is headed in the right direction. The runner is over at https://github.com/darobin/ipseity/blob/main/bin/bastest.js, and I made a quick and dirty port of the Kubo block tests at https://github.com/darobin/ipseity/blob/main/test-suite/ipfs/gateways/blocks.js.
The runner expects a gateway, and can be told to be quiet, to save JSON, and to output Markdown (it can do these in parallel).
A failing gateway run: ./bin/bastest.js --gateway https://berjon.com/
A successful run (assuming ipfs daemon --writable) that also saves JSON and MD: ./bin/bastest.js --gateway http://127.0.0.1:8080/ --save robin-test_0.0.1 --markdown scratch/test.md
The JSON it saves looks like this: https://github.com/darobin/ipseity/blob/main/test-reports/robin-test_0.0.1.json. Nothing phenomenal (and we'd have to figure out some useful conventions to produce interop reporting from it) but we can figure out workflows/configurations.
The MD it outputs is like this, just pasting it raw here:
Interplanetary Test Suite
Test HTTP Gateway Raw Block (application/vnd.ipld.raw) Support
GET unixfs dir root block and compare it with expected raw block
- ✅ GET with format=raw param returns a raw block
- ✅ GET for application/vnd.ipld.raw returns a raw block
Make sure expected HTTP headers are returned with the block bytes
- ✅ GET response for application/vnd.ipld.raw has expected response headers
- ✅ GET for application/vnd.ipld.raw with query filename includes Content-Disposition with custom filename
- ✅ GET response for application/vnd.ipld.raw has expected caching headers
Anyway — thoughts welcome!
Oh — I forgot an important point: one limitation of this approach is that the gateway must be at least minimally writeable, otherwise there's no way to put the fixtures there. There could be other ways of achieving the same, but I don't think they are any better; unless there's an idea I haven't considered, I suspect this is the best option.
@darobin can we not host a fixture and hard-code CIDs, like the current gateway checker does? e.g. https://ipfs.io/ipfs/bafybeifx7yeb55armcsxwwitkymga5xf53dxiarykms3ygqic223w5sk3m#x-ipfs-companion-no-redirect
We don't need to make the gateway writable.
@whizzzkid Yes, but I am concerned about making the test suite dependent on external infrastructure, network availability, etc. Ideally, it would be self-contained. I also worry about making this rely on PL infra (though I'm sure that we could convince others to host the fixtures too). If this is important enough, we could consider having it both ways, with a fixtures manifest that would drive some pinning and some kind of NO_WRITE env flag that would skip the writing when defined.
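A rough sketch of how that "both ways" option could look; the manifest format, the NO_WRITE flag handling, and the GATEWAY variable are placeholders rather than an agreed design:

```js
// provision.mjs — placeholder provisioning step; everything here is an assumption.
import { readFile } from 'node:fs/promises'

// e.g. [{ "cid": "bafk…", "path": "fixtures/raw-block.bin" }, …]
const manifest = JSON.parse(await readFile('fixtures/manifest.json', 'utf8'))
const gateway = process.env.GATEWAY

if (process.env.NO_WRITE) {
  // The operator has already pinned/hosted the fixtures; only verify they resolve.
  for (const { cid } of manifest) {
    const res = await fetch(new URL(`/ipfs/${cid}?format=raw`, gateway))
    if (!res.ok) throw new Error(`fixture ${cid} is not retrievable from ${gateway}`)
  }
} else {
  for (const { cid, path } of manifest) {
    // Push the fixture through whatever write mechanism the target exposes
    // (Kubo RPC, a writable gateway, remote pinning…); deliberately stubbed out here.
    console.log(`would provision ${cid} from ${path}`)
  }
}
```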
I think this type of runner would do the trick, at least for gateways. Rewriting all the tests will be tedious, but doable in chunks.
@darobin for I/O-less tests we can leverage inlined CIDs (use the identity multicodec instead of a hash digest).
$ echo 'hello inlined' | ipfs add --raw-leaves --inline
added bafkqadtimvwgy3zanfxgy2lomvsau bafkqadtimvwgy3zanfxgy2lomvsau
CID bafkqadtimvwgy3zanfxgy2lomvsau includes the data in its multihash digest, which means resolving it does not require the gateway to do any I/O or networking.
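For reference, the same inlined CID can be computed client-side with the multiformats JS package, which is part of what makes these handy for I/O-less tests (note that echo appends a newline, so it is part of the block bytes):

```js
import { CID } from 'multiformats/cid'
import * as raw from 'multiformats/codecs/raw'
import { identity } from 'multiformats/hashes/identity'

// "hello inlined" plus the trailing newline that echo added.
const bytes = new TextEncoder().encode('hello inlined\n')

// The identity "hash" just wraps the bytes, so the multihash digest *is* the data.
// (await is harmless if the hasher happens to be synchronous)
const digest = await identity.digest(bytes)
const cid = CID.createV1(raw.code, digest)

console.log(cid.toString()) // should print bafkqadtimvwgy3zanfxgy2lomvsau
```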
Existing tests
I created a high-level overview of tests we currently have living in Kubo - https://gist.github.com/galargh/74f2ecaf9da2f590756785694719566c
Findings (that are probably obvious for all involved here already but helped me reason about the space):
- most of the tests rely on the Kubo daemon for creating entities (ipfs add)
  - Could we decouple entity creation from the actual test? We could prepare a set of entities that must be available on the gateway for the tests, plus a script that uses a writable gateway API to publish them. If a gateway supports the write API, it could use the script; otherwise, the gateway operator would have to take care of entity creation.
- some of the tests rely on the Kubo daemon for verification of gateway responses ($RESPONSE == $(ipfs cat))
  - Would it be possible to always use concrete expected values?
- some of the tests reconfigure gateway settings during the run
  - Could we split such tests so that all the test cases within a single test would be expected to pass with an immutable gateway configuration?
- some of the tests spoof the Origin header
- other than that, the tests follow a pretty standard pattern: make a request and verify the response code, headers, and body
If we could separate Gateway configuration, including entity creation, from tests, could the tests become a collection of fixed request and response parameters?
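To make that concrete, a sketch of what a fixed request/response test case could look like; the field names are invented, and anything that varies between gateways (dates, exact Cache-Control values, etc.) would simply not be asserted on:

```js
// Hypothetical declarative test case: one fixed request plus the response
// properties we expect. Uses the inlined CID from earlier, so no provisioning needed.
export const rawBlockTest = {
  name: 'GET with format=raw returns a raw block',
  request: {
    path: '/ipfs/bafkqadtimvwgy3zanfxgy2lomvsau?format=raw',
    headers: {}
  },
  expect: {
    status: 200,
    headers: { 'content-type': 'application/vnd.ipld.raw' },
    body: new TextEncoder().encode('hello inlined\n')
  }
}
```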
Finally, having these existing tests gives us a fantastic opportunity: we can run the old and new suites in tandem to make sure the new one works as expected.
Where should this live?
I was considering the idea that we could have the tests and the specs in the same repo
I like this idea. It would enable a process where spec changes would have to be backed by accompanying tests. I wonder if we could somehow leverage a tool like https://github.com/warpfork/go-testmark to tie the two more closely together.
We should also make it easy for specs contributors to contribute to the testing suite. What language are they most likely familiar with? Are there any testing frameworks they are already more familiar with that we could lean on?
Reporting
The runner is over at https://github.com/darobin/ipseity/blob/main/bin/bastest.js
Would it make sense to reuse existing reporters?
A few quick notes:
- Yes, we could ensure that a number of fixtures are hosted and use those. We can also use the identity multicodec for at least some of the tests.
- We can definitely use concrete expected values if we have them locally, which I reckon we should.
- We should assume an immutable gateway configuration; if there are different useful behaviours that could be exposed independently, maybe we should consider making them part of the protocol?
- Spoofing Origin: not a big deal, we can set arbitrary headers. Or am I missing the issue?
- Fixed request/response: some bits aren't going to be regular or consistent (notably some headers), so I don't think it can be purely fixed. However, I think it's not far from the truth, so it should be almost that.
- There has been a lot of hesitation around having everything in a monorepo so I don't think that'll fly but happy to chat about it.
- Reusing reporters: Mocha only supports one reporter at a time and that thing reports to three outputs at the same time. It's just a quick hack, we don't have to do it that way. One alternative is to report to TAP and convert. (It's not less code or necessarily simpler, though.)
Thanks Robin!
Yes, we could ensure that a number of fixtures are hosted and use those. We can also use the identity multicodec for at least some of the tests.
I've seen that comment. Yes, I think that's a great pattern we should try to follow where possible and it makes sense.
Spoofing Origin: not a big deal, we can set arbitrary headers. Or am I missing the issue?
No, not at all. I just wanted to make a note of it.
Fixed request/response: some bits aren't going to be regular or consistent (notably some headers) so I don't think that it can be purely fixed. However, I think it's not far from the truth so it should be almost that.
Yes, I saw that there's a bit more going on in your example test with the cache-control header checks for example.
There has been a lot of hesitation around having everything in a monorepo so I don't think that'll fly but happy to chat about it.
Alright, if these discussions have already happened, then there's no reason for us to reopen them. Let's just start on the side and we'll see what the future brings.
Reusing reporters: Mocha only supports one reporter at a time and that thing reports to three outputs at the same time.
I found https://www.npmjs.com/package/mocha-multi, which extends Mocha with multi-reporter support. It works pretty well. Having said that, I haven't found the built-in reporters (especially HTML and MD) too pretty, so we might still want to invest in creating custom ones, but we can make them stand-alone and general purpose.
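If we do end up writing custom stand-alone reporters, a minimal Mocha reporter is quite small; a rough sketch of a Markdown one follows (the file name and output-path handling are just placeholders):

```js
// md-reporter.cjs — bare-bones custom Mocha reporter that writes a Markdown summary.
'use strict'
const { writeFileSync } = require('node:fs')

module.exports = function MarkdownReporter (runner, options = {}) {
  const lines = ['# Interplanetary Test Suite', '']
  runner.on('pass', (test) => lines.push(`- ✅ ${test.fullTitle()}`))
  runner.on('fail', (test, err) => lines.push(`- ❌ ${test.fullTitle()}: ${err.message}`))
  runner.on('end', () => {
    const out = (options.reporterOptions && options.reporterOptions.output) || 'report.md'
    writeFileSync(out, lines.join('\n') + '\n')
  })
}

// Usage (roughly): mocha --reporter ./md-reporter.cjs --reporter-options output=report.md
```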
I put together a prototype of what I was thinking of, to help me structure my thinking a bit and to aid with our upcoming discussion on how we want to proceed. I ended up with https://github.com/galargh/gateway-conformance. It implements the exact same test as https://github.com/darobin/ipseity/blob/main/test-suite/ipfs/gateways/blocks.js. In fact, I heavily used it to create my prototype 🙇
The bits that I find interesting:
- declarative test definition https://github.com/galargh/gateway-conformance/blob/main/test/raw-block.js#L8-L67
- fixtures defined separately so that they can be used by tests and by provisioners https://github.com/galargh/gateway-conformance/blob/main/util/fixtures.js#L104-L110
- support for different provisioners (enables creating custom provisioners per implementation) https://github.com/galargh/gateway-conformance/blob/main/util/provisioners.js#L4-L7
- parsing fixtures to compute CIDs, etc. using JS native UnixFS libs (i.e. no hardcoded CIDs) https://github.com/galargh/gateway-conformance/blob/main/util/fixtures.js#L40-L73
- multiple reporters out-of-the-box https://github.com/galargh/gateway-conformance/blob/main/package.json#L9
- how it comes together in CI (you can spot easily replaceable bits, e.g. provisioning) https://github.com/galargh/gateway-conformance/blob/main/.github/workflows/test.yml
One note: is https://github.com/galargh/gateway-conformance/blob/main/util/provisioners.js#L30-L33 (or https://github.com/darobin/ipseity/blob/main/test-suite/ipfs/gateways/blocks.js#L27-L34, for that matter) supposed to create a dir? I couldn't get it to work, so I relied on the Kubo provisioner.
Re mocha-multi: I tried it, but it didn't like me at all, so I gave up fighting it (the docs do say it's a pile of hacks) and just put it all in there :) But if it works for you, then 🚀!
I like your thing, it looks great and very usable! Really cool stuff.
I don't believe there's a reliable way to create a dir that way — I put it directly in the fixture.
(note we're exploring options for this work in https://github.com/ipfs/gateway-conformance)