runtime-tools

validation requires a network connection

Open cyphar opened this issue 8 years ago • 7 comments

At the moment, oci-runtime-tool validate attempts to fetch https://raw.githubusercontent.com/opencontainers/runtime-spec/v1.0.0/schema/config-schema.json. This is a problem if you want to do validation without having access to the network (and it's also a concern with distributions that would prefer to be able to package the schema somewhere). There are two ways I can see that we deal with this problem:

  • Just include the json file in the source code, so that it's baked into each binary. Obviously this has the downside that you can't test against future versions of the runtime-spec -- but ultimately we do a lot of validation outside of the json so it's of questionable use to be able to test future versions partially. This is what image-tools does (but the data is stored in the upstream image-spec repo).

  • Have some local cache (something like /usr/share/oci/runtime-tool/schemas/vXYZ.json) that is used if available, otherwise the schema is downloaded (and saved there if we have enough permissions). This is more fallback-friendly, and still has the "future versions" feature, but it's also kinda clunky in some senses.

cyphar avatar Oct 26 '17 04:10 cyphar

I think future versions are not a problem: runtime-tools cannot and should not guarantee that it will support future versions. To me, the problem is 'how can we get the schema files'.

We won't have this problem if we release runtime-tools as an RPM package. We can put the JSON files into the RPM package, so when a user installs it, he/she gets both the binary and a couple of data files. But right now we only have a binary file if we get runtime-tools via go get :(

So +1 for your second proposal, maybe we should add a new command like

    oci-runtime-tool init

so all the required data could be installed in the right place.

liangchenye avatar Oct 26 '17 10:10 liangchenye

We won't have this problem if we release runtime-tools as the rpm package.

Well, we already do this in openSUSE -- and I use the validation suite as part of umoci's test suite. :wink:

My issue with the second proposal is that it's not consistent with how oci-image-tools works, and it also feels clunkier than just including the schema inside the Go binary itself. oci-image-tools does this by having an in-memory "filesystem" which is generated using the tool "esc".

cyphar avatar Oct 26 '17 12:10 cyphar

On Thu, Oct 26, 2017 at 10:05:19AM +0000, 梁辰晔 (Liang Chenye) wrote:

So +1 for your second proposal…

Also +1 to this (previous discussion in #197, e.g. 1).

… maybe we should add a new command like

    oci-runtime-tool init

so all the required data could be installed in the right place.

If we use $XDG_DATA_DIRS (instead of the $XDG_CACHE_HOME I'd recommended earlier) we'd get good system-level locations for storing this data (e.g. if folk package the JSON Schema for their distro, possibly in per-spec-release packages) 2. Then the logic would be either:

a. Check $XDG_DATA_DIRS for the JSON. Fall back to the network if they're missing, but don't save anything fetched from the network; or

b. Check $XDG_DATA_DIRS for the JSON. If they're missing, fetch all JSON for your requested spec version into the highest-preference $XDG_DATA_DIRS. Then run the validation from the $XDG_DATA_DIRS location that contains the files.

wking avatar Oct 26 '17 16:10 wking

@wking I don't personally agree with the decision to not use esc in #197 (requiring a network for a file which is going to be static for any given version of the spec doesn't make any sense to me -- having it hard-coded into the binary makes more sense, at least as a default). Maybe something simpler would be nicer: generate a const string which contains the JSON and is then parsed by gojsonschema. If we go with that, I'd like to match the change in image-spec (for consistency reasons).

As for $XDG_DATA_DIRS, I'd still want an explicit global fallback in case a user sets $XDG_DATA_DIRS to something silly. Also a matching change in image-tools (this inconsistency between tools that are maintained by similar groups of people is a little worrying).

cyphar avatar Oct 26 '17 16:10 cyphar

On Thu, Oct 26, 2017 at 09:52:26AM -0700, Aleksa Sarai wrote:

@wking I don't personally agree with the decision to not use esc in #197 (requiring a network for a file which is going to be static for any given version of the spec doesn't make any sense to me -- having it hard-coded into the binary makes more sense, at least as a default).

Which versions would you compile in? I think the long-term goal is to be able to test against multiple spec versions. See #98 and 1 for more on this. With the current JSON Schema approach, we have the ability to perform preliminary validation on spec versions which were unreleased when a given runtime-tools version was cut. That validation won't pass (it raises an "unsupported version" error), but that bail-out happens after the JSON Schema validation. For example:

    $ ./oci-runtime-tool validate
    2 errors occurred:
    * Could not read schema from HTTP, response status is 404 Not Found
    * validate currently only handles version 1.0.0, but the supplied configuration targets 1.1.0

If we'd published 1.1.0 JSON Schema, that JSON Schema check would have gone through.

And even if we decide not to support future versions, would you compile in all (supported) past versions of the JSON Schema? I don't see a need to bundle all of that together, when loading what you need from the filesystem and/or network is so straightforward.

As for $XDG_DATA_DIRS, I'd still want an explicit global fallback in case a user sets $XDG_DATA_DIRS to something silly.

If you set $XDG_DATA_DIRS to something that doesn't contain the JSON Schema and you don't have a network connection, I'm fine giving you the 404 error shown above. I don't think we need to babysit users who decide to explicitly set silly $XDG_DATA_DIRS values.

Also a matching change in image-tools (this inconsistency between tools that are maintained by similar groups of people is a little worrying).

I agree that consistency is nice, but with the image folks not happy with their fs.go maintenance 2, I'd rather not follow them down that path.

wking avatar Oct 26 '17 17:10 wking

I think the long-term goal is to be able to test against multiple spec versions.

I'm not sure that this is something that makes a lot of sense to do "automatically" by just downloading a schema file from the internet and then trusting that the validation is correct. There's already a lot of additional checking we do in validate.go -- checking that might not be version-independent and is definitely not part of the schema we are downloading.

If we want to validate against multiple spec versions, we should curate which versions we test against (to make sure that we don't provide incorrect results for a spec version we didn't test an old version against -- after all this tooling is going to be part of the conformance testing eventually from what I understand).

You're right that we can do "preliminary checking" but I honestly don't see what the benefit is. It feels like it would be more confusing than it is helpful.

cyphar avatar Oct 26 '17 19:10 cyphar

On Thu, Oct 26, 2017 at 12:02:37PM -0700, Aleksa Sarai wrote:

I think the long-term goal is to be able to test against multiple spec versions.

I'm not sure that this is something that makes a lot of sense to do "automatically" by just downloading a schema file from the internet and then trusting that the validation is correct. There's already a lot of additional checking we do in validate.go -- checking that might not be version-independent and is definitely not part of the schema we are downloading.

Absolutely, which is why you will continue to get the:

validate currently only handles version …, but the supplied configuration targets …

error 1. The value added for future versions is that you might also get additional errors if you break a JSON Schema requirement. And the JSON Schema covers a lot, even if it doesn't cover everything.

If we want to validate against multiple spec versions, we should curate which versions we test against…

I am in favor of local tests to make sure we continue to make the expected (in)valid ruling on 1.0.0, etc. configs and runtimes as runtime-tools grows support for later versions. I don't think that's an argument for immediately throwing up our hands if we are asked to validate a newer version, when we can perform the JSON Schema check even without code specific to the new version.

wking avatar Oct 26 '17 20:10 wking