image-spec
image-spec copied to clipboard
config.rootfs is superfluous and should be removed
I think config.rootfs effectively[*] duplicates the manifest.layers information available elsewhere, and thus config.rootfs can/should be removed from the image spec. This makes image generation slightly simpler, removes an extra checksum calculation step during unpacking, and makes the image config object portable across minor image versions (one more cacheable thing).
[*] The values are not strictly identical, but the DiffIDs can be easily calculated from the manifest layer tar.gz.
Specifically, I propose just demoting rootfs from application/vnd.oci.image.config.v1+json from "required" to "ignored", which will be backwards compatible. history can remain as an optional annotation, although I note it appears highly out of place without rootfs, and probably belongs in some layer or manifest metadata instead.
If a runtime or other tool requires a value similar to the rootfs DiffID, this can be calculated during rendering from the image layers.
Note also that with this change, the sections discussing DiffID and ChainID calculation can be removed from the image config spec, since DiffID+ChainID are no longer relevant terms anywhere in the oci image spec (these concepts may be useful to runtimes, so perhaps might be moved there).
(This is also a question phrased as a bug report. I'd like to know why we need the rootfs DiffIDs, and why it should be in the config object, as opposed to manifest (with the rest of the layer info). The rootfs inclusion in config was unexplained and surprising to me from reading the spec documents - and indeed the ChainID discussion seems to be an obsolete section, no longer referenced by the actual schema)
Not only that, it is relatively expensive to compute (e.g. converting from Docker schema 1).
After a lengthy irc discussion, I've learned:
- @stevvooe is very patient :)
- The image config includes
rootfsbecause- the historically important runtime implementation used the hash of this config as a runtime image identifier - and thus historically it needed to be tied to the runtime image contents directly in some way
- the OCI image spec has inherited this historical overlap between image+runtime config.
- Yes,
rootfscould be removed and calculated on the fly at unpack time from the original layer tarballs, and in fact the DiffID is already independently recalculated in order to validate it against the config. - From a standards POV: Removing
rootfsmeans introducing a new version of the config json, to avoid breaking old clients that expect this field to be present.
The last point is the important one, and this trivial improvement is clearly not worth that level of churn. I expect this means this issue will sit on the pile for some time awaiting a larger, cumulative update to the config spec...
(@stevvooe: let me know if I've summarised our long conversation incorrectly - I'll edit as needed)
@anguslees Thanks for summarizing!
I understand some things about this specification are non-ideal but we can't let that get in the way of working software. As more implementations migrate to being able to handle digest/mediatype resolution of container image components, this is a transition that will be easier to do in the future.
I agree this adds an expense for diffID calculation, @mattmoor . Likely, a smoother next step could be relaxing the MUST on rootfs.type = layers, so that a future value and approach could be used. Regardless, for your ask of specifically demoting rootfs to being ignored, I don't think that is the ask here anymore. Should this issue get closed?
The suggestion is still to drop rootfs to ignored. The recognition above is that that will break existing clients and so requires some sort of deprecation/notification process.
Should this issue get closed?
I guess that depends on the scope of this GitHub project. If the point is to track issues with the spec that can be included in some future revision, then I guess it should stay open. If changes to the spec are managed through some other process and out of scope for GitHub issues, then yes, we should close it (and move to that other forum).
@anguslees @stevvooe I cannot find any references to
the sections discussing DiffID and ChainID calculation can be removed from the image config spec, since DiffID+ChainID are no longer relevant terms anywhere in the oci image spec
could you please clarify what you mean? Sorry to revive this old issue
See https://github.com/opencontainers/image-spec/blob/v1.1.0/config.md
@tianon I understand the idea of ChainID, I don't understand where the following was pointed out
since DiffID+ChainID are no longer relevant terms anywhere in the oci image spec
@anguslees @stevvooe I cannot find any references to
the sections discussing DiffID and ChainID calculation can be removed from the image config spec, since DiffID+ChainID are no longer relevant terms anywhere in the oci image spec
could you please clarify what you mean? Sorry to revive this old issue
There is a section in the spec that describes a "Layer ChainID".
This ChainID is not referred to elsewhere in other specs, and afaict from much my earlier conversation is just a dockerd internal implementation historical point of interest. It serves no purpose in the standard, and I suggest this whole "Layer ChainID" section can be removed immediately - although it is obviously not worth publishing a new spec revision just for this minor textual cleanup.
Note the above ChainID reference was a minor aside on this bug report. This bug report is that the entire config.rootfs section is unnecessary and (I propose) can be removed, if/when we publish the next version of this spec. At that point, the DiffID section (in addition to ChainID) can also be removed.
I'm fine with removing ChainID, but not the config.rootfs section. That would result in a non-unique Image ID, breaking downstream consumers including Docker and Kubernetes.
Sorry
I'm fine with removing ChainID, but not the config.rootfs section. That would result in a non-unique Image ID, breaking downstream consumers including Docker and Kubernetes.
It's been a while since I looked at the specs in enough detail ... but is this still true in our new containerd world?
I think these systems use manifest digest in all the important (exposed to the user) places, and if they use imageID internally somewhere, then it can (and arguably should!) be replaced with manifest digest or other reference.
I agree we would need to make it known that config digest is no longer unique. I think that means that ImageID should also be removed from image-spec, when we go ahead with this hypothetical new spec revision.
To be super-clear: yes, removing rootfs, and removing ImageID would be a breaking change, reserved for a future revision of image-spec - as mentioned earlier above. I am not proposing to sneak this into the existing version without a deprecation/notification process.
It's been a while since I looked at the specs in enough detail ... but is this still true in our new containerd world?
See https://github.com/opencontainers/image-spec/pull/1173
It's been a while since I looked at the specs in enough detail ... but is this still true in our new containerd world?
containerd still uses the rootfs section to identify the snapshots used by containers. The image ID is also used by the CRI plugin and there is value in keeping the image ID as referring to the single runnable image rather than to an index of multiple runnable images.
a smoother next step could be relaxing the MUST on rootfs.type = layers
We should do this otherwise any innovative work to move away from the layer model will require having a "non-compliant" image config. The rootfs section should not be removed though, an alternative way to uniquely identify the container rootfs will still be relevant.
We should do this otherwise any innovative work to move away from the layer model will require having a "non-compliant" image config. The rootfs section should not be removed though, an alternative way to uniquely identify the container rootfs will still be relevant.
Could an innovation away from the layer model use a different config media type? If so, the config.md wouldn't apply and the spec language for the manifest.md has a looser recommendation that gets away from the MUST language:
If this image manifest will be "runnable" by a runtime of some kind, it is strongly recommended to ensure it includes enough data to be unique (such as the rootfs and diff_ids included in application/vnd.oci.image.config.v1+json) so that it has a unique ImageID.
I would expect it to use a different media-type, but the main takeaway I had from the whole artefacts saga is that public registries are so strict about media-types (which we made up :thinking:...) that any innovation requires abusing parts of the spec to sneak things past registries. Otherwise you won't be able to get things working "well enough" to start working on spec changes.
Maybe I'm off-base, but that was my impression. :weary: