OCI artifact manifest, Phase 1-Reference Types

Open SteveLasker opened this issue 4 years ago • 46 comments

PR Status

On the July 21, 2021 OCI call, and additional OCI TOB discussion, the following plan of action was decided:

  • Over the next few weeks the OCI TOB will vote on TOB#99, then a modified version of #96 to reflect the finalized OCI Working group template.
  • While the OCI TOB finalizes the working group process, the implementation of Artifacts PR#29 will take place under the artifacts-spec repo that has been created under the oras-project. The artifacts-spec project README reflects the intent that the project will be proposed to be onboarded to the OCI once the working group process is defined.
  • To avoid OCI branding and trademark concerns, the artifacts-spec will use oras mediaTypes and oras paths for apis, avoiding dependencies on or conflicts with the distribution-spec based apis.
  • Once OCI defines a working group process that enables the collaboration of the artifacts-spec working group, onboarding of the artifacts-spec repo to OCI can begin.

I'm leaving this PR open, and intact with the current files, to preserve the comments. We continue to implement and take input under oras-project/artifacts-spec


The OCI artifact manifest generalizes the OCI image manifest by reducing the constraints placed on all artifacts, enabling specific artifact-specs to set constraints for their type. Phase 1 adds support for artifacts to reference other artifacts through a subjectManifest property, enabling reference graphs such as those required for secure supply chain efforts.

Phase 1: Reference Types

The PR focuses on Phase 1, enabling reference type support in 2021, supporting secure supply chain artifact types including signatures and SBoMs.

Phase 2 Generic Artifact Versioning Support

Phase 2 will focus on the scenarios outlined in PR #37.

By splitting these out into phases, we can reduce the scope, for 2021, while providing time for phase 2 to evolve.

See: artifact-manifest.md for the overview of content, and artifact-manifest-spec.md for spec details.

Signed-off-by: Steve Lasker [email protected]

SteveLasker avatar Feb 10 '21 02:02 SteveLasker

This is looking really good. I am still trying to figure out how to tie in use cases that aren't attached to a specific image manifest, but instead the entire repository. Examples that come to mind are TUF targets and snapshots that represent the current state of all known signed images in a repository. Another example could be repository metadata of when it was created, who owns the repo, number of stars, number of pulls, etc.

Ideally, I'd like to have a way to query for these that doesn't conflict with the image tag namespace. If there's a way to query for an artifact by type, but without specifying the attached image digest, I think we'd have a solution.

sudo-bmitch avatar Feb 10 '21 12:02 sudo-bmitch

I am still trying to figure out how to tie in use cases that aren't attached to a specific image manifest, but instead the entire repository

Due to the high concurrency of content pushed/pulled to a registry, I don't believe we have a design to handle this. I'm also not sure we have a requirement.

Another example could be repository metadata of when it was created, who owns the repo, number of stars, number of pulls, etc.

This is yet another round of updates I'm hoping we can layer in, once we get past the new OCI Artifact Manifest discussions. See Adding Metadata Services to OCI Distribution-Draft for some initial thoughts. It would account for registries serving [read-only] content, such as pull count, "stars upon thars". I suspect the meta-data queries will come into the list API requirements as well. See Show/Get-Info API Requirements #232-Data Returned

SteveLasker avatar Feb 10 '21 17:02 SteveLasker

Is there any resolution for @jonjohnsonjr's suggestion on using the OCI index to map references? Something like:

{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "size": 7143,
      "digest": "sha256:0228f90e926ba6b96e4f39cf294b2586d38fbb5a1e385c05cd1ee40ea54fe7fd",
      "annotations": {
        "org.opencontainers.image.ref.name": "stable-release"
      }
    },
    {
      "mediaType": "application/vnd.cncf.notary.v2+json",
      "size": 7143,
      "digest": "sha256:e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f",
      "references":[
         {"type": "signature",
          "artifact": "sha256:0228f90e926ba6b96e4f39cf294b2586d38fbb5a1e385c05cd1ee40ea54fe7fd"
          }]
      }
  ],
  "annotations": {
    "com.example.index.revision": "r124356"
  }
}

nishakm avatar Mar 09 '21 19:03 nishakm

We feel it's best to move forward with the proposal in this PR to decouple from image.manifest and image.index. Using the new oci.artifact.manifest provides a clear definition for the references required for Notary & SBoMs. It also allows us to eventually add support for weak references, as sketched in #27

SteveLasker avatar Mar 09 '21 19:03 SteveLasker

We feel it's best to move forward with the proposal in this PR to decouple from image.manifest and image.index. Using the new oci.artifact.manifest provides a clear definition for the references required for Notary & SBoMs. It also allows us to eventually add support for weak references, as sketched in #27

There is some overlap between references and SPDX relationships. It seems to me that this could be useful here. Or maybe it's overkill and all we need is something describing an undirected/directed and mandatory/optional references.

nishakm avatar Mar 09 '21 20:03 nishakm

These manifests (oci.image, oci.index, oci.artifacts) are very coupled to how content is stored in a registry, enabling content discovery, acquisition and eventual cleanup. Storing documents like SPDX, 3T-SBoM or others also makes sense as they are content that just happens to be in a registry. Mixing different manifest types can confuse things. This is why I keep going back to what requirements we are trying to solve.

SteveLasker avatar Mar 09 '21 20:03 SteveLasker

It would help me understand what's being proposed if you could revisit the OCI Artifact Manifest Properties section a bit.

I'd like to see a really rigorous description of these fields, similarly to how index and image are defined. Specifically, I want to understand what they mean.

Your current descriptions are really abstract and don't really describe the actual semantics of the fields. Let's separate out the format semantics from your expectations of how registries handle these so that we can discuss those individually. You also introduce a concept of "Extension artifacts" without defining what that is.

I have a feeling that you don't really care about the artifact format -- you actually care about the semantics of the relationships between artifacts. If I'm right, I would suggest that defining a new artifact is a terrible idea, and that what you actually want is to augment the properties of a descriptor such that we can express new kinds of relationships.

Your current proposal seems to be limited in that only new artifact manifests are allowed to have these new kinds of relationships, which seems inflexible and less powerful than enhancing the existing relationship abstraction we already have (the descriptor). I'd like to be able to express these kinds of things using existing formats and new formats. I don't want to have to invent a new format to express any other kinds of relationships we come up with.

jonjohnsonjr avatar Mar 09 '21 21:03 jonjohnsonjr

I just did a presentation on the new OCI Artifact Reference types and their supported scenarios and needs. The deck is here. As the videos are uploaded here I'll update with the specific link.

I have a few Notary v2 and ORAS updates to complete for Notary prototype-2. After that, I'll convert the current examples to an actual spec oci-artifact-manifest-spec.md, identifying the specifics you and others have been asking for.

For example:

  • [manifests] references must be in the same repo as they extend another artifact.
  • [manifests] entries are optional. Individual artifacts, like OPA or even the current implementation of Helm and CNAB could use this new manifest and simply not use [manifests] until they need them.
  • Artifacts that use oci.artifact.manifest and include a [manifests] entry are subject to deletion when their referenced manifest is deleted. If you delete net-monitor:v1, all the Notary v2 signatures and associated SBoMs would be deleted. (ref counted -1)
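
The deletion rule in the last bullet can be sketched roughly as below. This is only an illustration under stated assumptions, not spec text: `ToyRepo` is a hypothetical in-memory stand-in for a repository, the digests are placeholders, and it assumes an artifact is removed as soon as any manifest it links to is removed.

```python
class ToyRepo:
    """Hypothetical in-memory repository; not a real registry API."""

    def __init__(self):
        self.manifests = {}   # digest -> manifest dict
        self.referrers = {}   # subject digest -> set of digests that link to it

    def put(self, digest, manifest):
        self.manifests[digest] = manifest
        # Record the reverse ("upward") link for every [manifests] entry.
        for desc in manifest.get("manifests", []):
            self.referrers.setdefault(desc["digest"], set()).add(digest)

    def delete(self, digest):
        """Delete a manifest and, recursively, every artifact linked to it."""
        manifest = self.manifests.pop(digest, None)
        if manifest is None:
            return
        # Drop this manifest from the reverse index of its own subjects.
        for desc in manifest.get("manifests", []):
            self.referrers.get(desc["digest"], set()).discard(digest)
        # Cascade to everything that referenced the deleted manifest.
        for child in list(self.referrers.pop(digest, ())):
            self.delete(child)


repo = ToyRepo()
repo.put("sha256:image", {"artifactType": "image"})
repo.put("sha256:sig", {"artifactType": "signature",
                        "manifests": [{"digest": "sha256:image"}]})
repo.put("sha256:sbom", {"artifactType": "sbom",
                         "manifests": [{"digest": "sha256:image"}]})
repo.put("sha256:sbom-sig", {"artifactType": "signature",
                             "manifests": [{"digest": "sha256:sbom"}]})
repo.put("sha256:other", {"artifactType": "image"})

repo.delete("sha256:image")
remaining = sorted(repo.manifests)
print(remaining)
```

Deleting the image removes its signature, its SBoM, and the SBoM's signature, while the unrelated manifest stays in place.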

We have been through several rounds of discussions for changing the descriptor or one of the existing manifests. These were all non-starters, with lots of filibustering. Rather than thrash existing schemas, implying a lot of instability to tooling that's already making lots of assumptions about the current manifests, we're focused on the new manifest to address the new needs. Since it's a superset of image.manifest, there's nothing stopping the current image tools from adopting it. It could be the basis of the versioning problem we're having with any changes to image.manifest.

SteveLasker avatar Mar 09 '21 21:03 SteveLasker

The deck is here.

Is this available in a less hostile file format?

filibustering

sigh

Since it's a superset of image.manifest

I don't believe you understand what superset means.

It could be the basis of the versioning problem we're having with any changes to image.manifest.

This does not solve any problems with versioning. It's just a new version. There aren't any proposed mechanisms for how to change it that differ in any way from what we have today, as far as I can tell.

jonjohnsonjr avatar Mar 09 '21 21:03 jonjohnsonjr

Regarding requirements: What are they exactly? This is what I have been able to grok thus far:

  • We want to store artifacts that are related to a container image (signatures, SBoM, supplemental artifacts, etc)
  • We want to store artifacts that reference one or more container images edit: and their related artifacts (Helm charts, CNAB, k8s deployments, etc)
  • We want all of these collections of related and referenced artifacts to be movable from registry to registry without changing their relationships

From the garbage collection point of view, it makes sense to me that there needs to be a "root" that has all the connections to all of the artifacts, and OCI index seems to be a good candidate for it. But I can also see the need for something that describes all of these artifacts and their relationships and this is where the SBoM can actually help. Things like Helm charts and CNABs can have their own SBoM that describes all the related and required artifacts such as the container images and the signatures for the container images.

Regarding the digest of index.json, I don't think this is a problem. Folks want to know what changed and where in the artifact tree the change happened. IMHO, the digests are the versions.

nishakm avatar Mar 09 '21 22:03 nishakm

Your current proposal seems to be limited in that only new artifact manifests are allowed to have these new kinds of relationships, which seems inflexible and less powerful than enhancing the existing relationship abstraction we already have (the descriptor). I'd like to be able to express these kinds of things using existing formats and new formats. I don't want to have to invent a new format to express any other kinds of relationships we come up with.

IIRC, there were some concerns on allowing arbitrary content descriptors with regards to backwards compatibility with existing client tools. Initially, I had looked at content descriptors to describe things and their relationships. Unfortunately, "backwards compatibility" seems to be the de-facto reason for not including something in the spec so my recollection may be faulty.

Personally, I think there is nothing stopping registries from being instantiated as an "everything else" storage solution like bundle.bar and creating a whole distributed thingy around that, including a new artifact merkle DAG that has nothing to do with the image spec.

nishakm avatar Mar 10 '21 01:03 nishakm

Initially, I had looked at content descriptors to describe things and their relationships.

I do this all over the place, and it's a good pattern. The content descriptor is a generic and useful abstraction, even outside of OCI, and I've been trying to get more people to adopt it instead of inventing new stuff.

Personally, I think there is nothing stopping registries from being instantiated as an "everything else" storage solution like bundle.bar and creating a whole distributed thingy around that, including a new artifact merkle DAG that has nothing to do with the image spec.

This is exactly how the registry is designed and works today. I'm fine with creating a new kind of generic node in the DAG if we think we need one, but defining the semantics of that will be tricky. As far as I know, all registries today are "strongly typed" in that they only know how to parse a small number of node types (by their mediaType, as indicated in the Content-Type header): image and index.

Index is a list of pointers, so you can implement any kind of graph you want -- if you squint and think about Lisp, this is really powerful.

Image is a list of pointers + a special pointer. This is convenient, but not any more powerful than an index, really.

One unfortunate reality of dealing with registries in the wild is that there are vastly different interpretations of the image and registry specs, especially around garbage collection and what an image or index is allowed to reference. Can images only reference blobs? Can indexes reference blobs, or just manifests? What do we do if the registry doesn't understand a media type of a descriptor within a manifest? Should we just ignore it? Assume it's a blob? Assume it's a manifest? Are blobs and manifests in the same CAS namespace, or should those be treated separately -- e.g. if I push something through /manifests/ should it be readable through /blobs/ -- vice versa?

I've had a couple ideas around this (off topic but we can get into that if anyone is interested), but they would require registry operators to all agree on some semantics that are currently undefined and with mutually incompatible implementations :(

This is one reason I really want Steve to spell out the semantics of these new artifact types. Up until this point, we haven't defined anything about ref counting or garbage collection expectations. This new artifact type introduces requirements around that, so we need to address the baseline expectations of registries if we're going to layer on top of them. It doesn't make sense to define a weak reference if we don't also define a strong reference, or at least contrast the weak reference with "every other kind of reference is undefined behavior and registries can do whatever they want".

Unfortunately, "backwards compatibility" seems to be the de-facto reason for not including something in the spec so my recollection may be faulty.

I've brought up ~two separate concerns around backward compatibility, and I don't think I've done a great job of expressing my points, so let me try to clarify:

  1. If it's possible to adapt your use case to work with existing clients and registries such that we don't have to change anything and everything continues to work, we should do that. This was roughly the conclusion of the OCI Artifacts stuff, I believe.
  2. If we really need to add new functionality to clients or registries to support a new use case, let's do it in the least disruptive way possible:

I think we've gone past the first point and into the second point now, since registries will need to maintain or produce an inverted index for weak references. As I've said before, weak references and inverted indexes would be generally useful constructs for other artifacts, and I think they should be pulled out of this massive, confused proposal so that we can talk about the best way to go about implementing them in isolation.

I have a huge problem with just adding another artifact type and defining entirely new semantics for only that artifact type because it doesn't fit into the existing design of OCI data structures at all. We also ran into a similar problem with foreign layers, which I believe similarly landed in docker and OCI by fiat from Microsoft because it was a business requirement. It doesn't fit into the model, doesn't compose with other abstractions, is completely under-specced, and is a huge source of bugs -- they even have a CVE!

I'll try to explain again my issue with this, abstractly, in terms of boxes and arrows:

The current proposal defines a new type of box that is very slightly different in shape from the existing boxes, but the primary feature of this new type of box is that it has a new kind of arrow, even though those arrows are defined in the exact same way as arrows coming out of other boxes, and look identical, so there's no indication that they should be treated differently outside of the definition of the box. Also, only some of the arrows coming out of the new box are of the new kind.

image

At this point I don't really care about stopping Steve from defining a new artifact type. I think it's a bad idea, but my primary goal is just to make the design of the new mechanism not bad. These dashed arrows shouldn't be specific to an artifact manifest. We have already formally specified the behavior of arrows. Why can't we make "dashed" a property of an arrow instead of a property of the box that contains the arrow? The Descriptor definition specifically calls out that it should be considered for extension before doing format-specific things:

Extended Descriptor field additions proposed in other OCI specifications SHOULD first be considered for addition into this specification.

jonjohnsonjr avatar Mar 10 '21 17:03 jonjohnsonjr

Initially, I had looked at content descriptors to describe things and their relationships.

I do this all over the place, and it's a good pattern. The content descriptor is a generic and useful abstraction, even outside of OCI, and I've been trying to get more people to adopt it instead of inventing new stuff.

This section probably needs more examples then. I don't quite understand how "This section defines the application/vnd.oci.descriptor.v1+json media type." and "mediaType string: This REQUIRED property contains the media type of the referenced content" relate.

Unfortunately, "backwards compatibility" seems to be the de-facto reason for not including something in the spec so my recollection may be faulty.

I've brought up ~two separate concerns around backward compatibility, and I don't think I've done a great job of expressing my points, so let me try to clarify:

1. If it's possible to adapt your use case to work with existing clients and registries such that we don't have to change _anything_ and everything continues to work, we should do that. This was roughly the conclusion of the OCI Artifacts stuff, I believe.

I thought this was not possible as existing clients will either try to spin up a set of blobs when they shouldn't or barf when encountering a manifest layout they do not understand.

2. If we really need to add new functionality to clients or registries to support a new use case, let's do it in the least disruptive way possible:

I'm not sure existing clients are capable of addressing supplemental or related artifacts. However, index.json sounds like it's capable of accommodating an "artifacts" manifest as Steve has described. The relationships/references thing can be discussed some more. My other concern with the content descriptor is the requirement to adhere to IANA descriptors. I suppose one could just use json, but I am still unsure how to actually use them 😅.

Unfortunately, most of my concerns around this proposal aren't really captured by the notary requirements, so it's hard to argue with Steve who will only consider concerns valid if they can be mapped directly to a notary v2 requirement.

I think other folks also have the need to be able to reference supplemental artifacts to verify supply chain integrity, provenance, etc. The spec, as it is, doesn't meet the base 3 requirements I had listed above. Can we start there instead?

nishakm avatar Mar 10 '21 22:03 nishakm

If it's possible to adapt your use case to work with existing clients and registries such that we don't have to change anything and everything continues to work, we should do that. This was roughly the conclusion of the OCI Artifacts stuff, I believe.

Artifacts "v1" was really about formalizing what people were already doing: stuffing additional content types in a registry, and just making them look like images, by using the same mediaTypes of an oci.image. While it was easier to identify the type through a formal manifest.artifactType property, it was felt to be too risky to make a breaking change to the schema, and we could just use manifest.config.mediaType. So we did.

The new oci.artifact.manifest supports a new reference type. To your point, these are considered strong references. The weak references (#27) were deferred, for now. If the referenced artifact under [manifests] is deleted, the artifact referencing it should also be deleted (ref count -1). I'll get this written up in the oci-artifact-manifest-spec.md next week.

If we really need to add new functionality to clients or registries to support a new use case, let's do it in the least disruptive way possible:

The oci.artifact.manifest is new, but not intended for the existing clients. In fact, it's explicitly avoiding the existing clients by using a new manifest.mediaType, to assure we can innovate without breaking compat.

I think we've gone past the first point and into the second point now since registries will need to maintain or produce an inverted index for weak references. As I've said before, weak references and inverted indexes would be generally useful constructs for other artifacts

Yes, we will need a new index, which registry operators can choose their specific implementation. Just a minor point of clarity, as I'd like to think of these as strong/hard references. When you post an oci.image.manifest, the digests of the manifest must already exist in the registry/repo. If not, the manifest put fails. This will be the same for entries in [manifests]. It would not be the case with [references] as defined in the punted #27 proposal.
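
A rough sketch of that push-time check, under stated assumptions: `validate_put`, `MissingReferenceError`, and the digests are hypothetical names used only for illustration, and a real registry would check its own storage rather than in-memory sets.

```python
class MissingReferenceError(Exception):
    """Raised when a manifest points at content the repo does not hold."""


def validate_put(manifest, known_blobs, known_manifests):
    """Reject a manifest whose [blobs] or [manifests] digests are not already present."""
    for desc in manifest.get("blobs", []):
        if desc["digest"] not in known_blobs:
            raise MissingReferenceError(f"unknown blob {desc['digest']}")
    for desc in manifest.get("manifests", []):
        if desc["digest"] not in known_manifests:
            raise MissingReferenceError(f"unknown manifest {desc['digest']}")


known_blobs = {"sha256:sigblob"}
known_manifests = {"sha256:image"}

# A signature whose blob and subject manifest are both present is accepted.
signature = {
    "blobs": [{"digest": "sha256:sigblob"}],
    "manifests": [{"digest": "sha256:image"}],
}
validate_put(signature, known_blobs, known_manifests)

# A manifest linking to a digest the repo has never seen is rejected.
dangling = {"manifests": [{"digest": "sha256:missing"}]}
try:
    validate_put(dangling, known_blobs, known_manifests)
    rejected = False
except MissingReferenceError:
    rejected = True
print(rejected)
```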

I think they should be pulled out of this massive, confused proposal

What is massive and confusing?

The new manifest is pretty straightforward. It's a new manifest to decouple from image-specific scenarios. This frees up OCI Image v2, and allows artifacts, which could be images, to evolve cleanly.

  1. A new manifest.artifactType property to decouple from manifest.config.mediaType
  2. [layers] renamed to [blobs]
  3. [manifests] collection for "hard links" to existing manifests in the same repo.
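
To make the three changes concrete, here is a sketch that assembles such a manifest. Only the property names (artifactType, blobs, manifests) come from this thread; the media type strings for the manifest and the signature, the payloads, and the `descriptor` helper are assumptions for illustration.

```python
import hashlib
import json


def descriptor(media_type, content):
    """Build a content descriptor (mediaType, digest, size) for raw bytes."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(content).hexdigest(),
        "size": len(content),
    }


# Placeholder payloads; a real signature and image manifest would go here.
signature_blob = b"example signature payload"
subject_manifest = b'{"schemaVersion": 2}'

artifact_manifest = {
    # Assumed media type string for the new manifest.
    "mediaType": "application/vnd.oci.artifact.manifest.v1+json",
    # 1. artifactType identifies the artifact instead of config.mediaType.
    "artifactType": "application/vnd.example.signature.v1",
    # 2. [blobs] takes the place of [layers].
    "blobs": [
        descriptor("application/vnd.example.signature.v1+json", signature_blob)
    ],
    # 3. [manifests] holds hard links to manifests in the same repo.
    "manifests": [
        descriptor("application/vnd.oci.image.manifest.v1+json", subject_manifest)
    ],
}

print(json.dumps(artifact_manifest, indent=2))
```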

image

At this point I don't really care about stopping Steve from defining a new artifact type. I think it's a bad idea, Unfortunately, most of my concerns around this proposal aren't really captured by the notary requirements, so it's hard to argue with Steve who will only consider concerns valid if they can be mapped directly to a notary v2 requirement.

I'm mapping designs to meet requirements. Notary, SBoM, GPL Source, Nydus and other artifact types benefit from these. So, yes, these designs do map to requirements, not just Notary. If Notary v2 isn't adopted, these enhancements have value unto themselves. So, I'm not really sure what you're objecting to.

Usable workflows, enabled for the masses to easily create and consume Notary v2 signatures

We've incorporated a lot of great feedback, including the flow to push the image as a digest, push the signature, then do the tag update, so I think we're incorporating all relevant and actionable feedback. We've also demonstrated pretty clean workflows (nv2 demo script and nv2 video), so I'm still not sure what you're objecting to, or even what you're proposing. There's just a lot of debate. You don't have to agree. That's the beauty of opinions and extensions. You don't have to agree or even implement them.

The spec, as it is, doesn't meet the base 3 requirements I had listed above. Can we start there instead?

Can you list the 3 requirements?

SteveLasker avatar Mar 11 '21 00:03 SteveLasker

Can you list the 3 requirements?

  • We want to store artifacts that are related to a container image (signatures, SBoM, supplemental artifacts, etc)
  • We want to store artifacts that reference one or more container images and their related artifacts (Helm charts, CNAB, k8s deployments, etc)
  • We want all of these collections of related and referenced artifacts to be movable from registry to registry without changing their relationships

I'm going to add a 4th one here: We need to be able to append artifacts based on their relationships

nishakm avatar Mar 11 '21 14:03 nishakm

Thanks @nishakm. All 3 are covered in this proposal. The PR has some example manifests, for a signature and SBoM here

Below is an image that shows how the individual artifacts are linked together:

  1. net-monitor:v1 image
  2. 3 linked signatures of the net-monitor:v1 image
  3. An SBoM, linked to the net-monitor:v1 image
  4. A signature of the net-monitor:v1 SBoM
  5. Yet Another Artifact Type (YAAT), linked to the SBoM
  6. A signature of the net-monitor:v1- SBoM - YAAT.

All the downward arrows are represented by the existing manifests, and the config and [blobs] collection of the oci.artifact.manifest. The upward arrows represent the entries in the new [manifests] collection.

image
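
The upward links in that picture can be modeled as a small graph. The sketch below inverts the links and walks them to find everything attached to net-monitor:v1, directly or transitively. The short names stand in for digests and the helper functions are hypothetical; a registry would maintain this reverse index itself.

```python
from collections import deque

# artifact -> the manifest it links to through [manifests] (None for the root image)
links = {
    "net-monitor:v1": None,
    "sig-1": "net-monitor:v1",
    "sig-2": "net-monitor:v1",
    "sig-3": "net-monitor:v1",
    "sbom": "net-monitor:v1",
    "sbom-sig": "sbom",
    "yaat": "sbom",
    "yaat-sig": "yaat",
}


def invert(links):
    """Turn upward links into a subject -> [referrers] index."""
    referrers = {}
    for name, subject in links.items():
        if subject is not None:
            referrers.setdefault(subject, []).append(name)
    return referrers


def linked_artifacts(root, referrers):
    """Breadth-first walk collecting every artifact linked to root."""
    found, queue = [], deque([root])
    while queue:
        current = queue.popleft()
        for child in referrers.get(current, []):
            found.append(child)
            queue.append(child)
    return found


attached = linked_artifacts("net-monitor:v1", invert(links))
print(attached)
```

A walk like this is what a copy between registries would need in order to move the image together with its signatures, SBoM, and their own linked artifacts.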

The target experience we're shooting for with the Notary prototype-2 is sketched here

ORAS will be used as a CLI, for demonstration purposes, but ORAS and nv2 will also provide libraries, so you can build this docker type experience

SteveLasker avatar Mar 11 '21 17:03 SteveLasker

Details on the oci.artifact.manifest spec provided. Including a change from manifests to references.

SteveLasker avatar Mar 18 '21 04:03 SteveLasker

I'm on the fence between using [manifests], [references], [manifest-refs] or something else. The intent is a collection of manifests, as OCI artifacts can refer to other manifests. It's not intended to refer to other blobs. While it dupes the name of manifests in the OCI Index, that's actually ok, as they both are a collection of manifests. The difference is the OCI Index is a "downward" collection of manifests that make up a thing, pivoted on platform/arch. While the OCI Artifact manifests are a reverse ("upward") reference to manifests, to extend their data.

The other thing to notice in this manifest is it's a subset of the oci-image restrictions. The intent dates back to the refactoring of various artifact types. Distribution supports all types of artifacts, based on a few manifests. OCI Artifacts is the means to generically define how something can be structured, to be stored. Then, you have various Artifact specs, including the image-spec, that take advantage of the various manifests.

The setup here is the image-spec could be a more narrowly defined use of the oci.artifact.manifest spec as it provides a superset of capabilities, with a subset of constraints. It also has clearly defined versioning semantics.

image

SteveLasker avatar Mar 18 '21 15:03 SteveLasker

Is there an understood clear path for getting this merged/accepted/ratified? Since it's a brand new spec I'd guess that it would need to be approved by the full OCI TOB at some point, is that correct?

I'm trying to understand where this is in the process of going from draft to something that might be supported widely. What steps/approvals are left before registries would start implementing this?

dlorenc avatar Mar 21 '21 12:03 dlorenc

Latest update accounts for:

  • zero [blobs] support clarification from @sudo-bmitch
  • updates the signature examples to remove the config entry
  • moves the [references] array back to [manifests] to clarify this is a collection of manifest descriptors. Other names might be [manifest-refs] (naming is hard), but let's agree on the definition/behavior, and the name will likely fall out.

SteveLasker avatar Mar 24 '21 16:03 SteveLasker

We're doing some active validation of the oci.artifact.manifest spec in the Notary v2 working group. The latest update adds support, including /v2/_ext/oci-artifacts/v1/<repo>/manifests/<digest>/links?artifact-type=xyz to enable linked artifacts discovery
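
A minimal sketch of how a client might build that discovery URL. The path template comes from the comment above; the registry host, the repository name, and the `links_url` helper are assumptions made for illustration, and the response format is left out because the thread does not specify it.

```python
from urllib.parse import quote, urlencode


def links_url(registry, repo, digest, artifact_type=None):
    """Build the linked-artifacts discovery URL for one manifest digest."""
    path = (
        f"/v2/_ext/oci-artifacts/v1/{quote(repo, safe='/')}"
        f"/manifests/{quote(digest, safe=':')}/links"
    )
    # The artifact-type filter is optional; omit the query string without it.
    query = f"?{urlencode({'artifact-type': artifact_type})}" if artifact_type else ""
    return f"https://{registry}{path}{query}"


url = links_url("registry.example.com", "net-monitor", "sha256:abc", "xyz")
print(url)
```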

While this work will continue validations, we'd like to start putting 👀 on a newer proposal that solves the linked artifact references, and general versioning problems we've had with the image-spec. See WIP generic object spec #37

SteveLasker avatar Mar 29 '21 23:03 SteveLasker

Was that last commit an accident? It looks like it was supposed to go here: https://github.com/notaryproject/artifacts/tree/prototype-2

dlorenc avatar Apr 05 '21 14:04 dlorenc

Just some doc updates while it's in draft mode.

SteveLasker avatar Apr 05 '21 16:04 SteveLasker

Homebrew https://brew.sh is a package manager that supports both macOS and Linux. We have binary packages (called bottles), which are tarballs, for multiple versions of macOS and one universal Linux bottle that works on all distributions. We store each bottle in an ORAS artifact in an image manifest. We bundle these image manifests up into a single image index. We use the .manifests[].platform object, which includes architecture, variant, os, and os.version to select which bottle to download. See https://github.com/opencontainers/image-spec/blob/master/image-index.md#image-index-property-descriptions
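
The selection step can be sketched roughly as follows. This is not Homebrew's client code: the `select_manifest` helper, the digests, and the platform values are hypothetical, and only the field names (architecture, os, os.version) come from the image index spec.

```python
def select_manifest(index, architecture, os_name, os_version=None):
    """Return the first manifest descriptor whose platform matches, else None."""
    for desc in index.get("manifests", []):
        platform = desc.get("platform", {})
        if platform.get("architecture") != architecture:
            continue
        if platform.get("os") != os_name:
            continue
        # Only compare os.version when the caller asks for a specific one.
        if os_version is not None and platform.get("os.version") != os_version:
            continue
        return desc
    return None


# Illustrative index with placeholder digests.
index = {
    "schemaVersion": 2,
    "manifests": [
        {"digest": "sha256:aaa",
         "platform": {"architecture": "amd64", "os": "darwin", "os.version": "macOS 11.2"}},
        {"digest": "sha256:bbb",
         "platform": {"architecture": "amd64", "os": "darwin", "os.version": "macOS 10.15"}},
        {"digest": "sha256:ccc",
         "platform": {"architecture": "amd64", "os": "linux"}},
    ],
}

mac_bottle = select_manifest(index, "amd64", "darwin", "macOS 10.15")
linux_bottle = select_manifest(index, "amd64", "linux")
print(mac_bottle["digest"], linux_bottle["digest"])
```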

You can see examples of these ORAS image indexes at https://github.com/orgs/Homebrew/packages/container/package/core/hello and https://github.com/orgs/brewsci/packages/container/package/bio/seqkit

We use the media types application/vnd.oci.image.index.v1+json and application/vnd.oci.image.manifest.v1+json and even application/vnd.oci.image.layer.v1.tar+gzip for compatibility with oras, skopeo, even docker, though the Docker "image" can't be run due to not having the necessary dependencies included.

I'm not at all familiar with the proposal in this PR, and wasn't familiar with it when we came up with this solution. It sounds related though, and just wanted to share how we tackled this related issue.

Postscript

$ docker run ghcr.io/brewsci/bio/seqkit:0.15.0 seqkit/0.15.0/bin/seqkit version
seqkit v0.15.0

Incidentally, the image ghcr.io/brewsci/bio/seqkit:0.15.0 can be run, because it's a static executable with no dependencies, but that's not generally true of Homebrew bottles stored on GitHub Container Registry.

sjackman avatar Apr 28 '21 22:04 sjackman

Thanks @sjackman, The multi-arch angle to get multi-arch binaries is pretty cool. I'm curious why you stayed with the container image mediaTypes, vs. defining your own, vnd.brew.*? Is this because you can run them as container images with docker run? Or, because docker hub hasn't opened the mediaTypes yet?

SteveLasker avatar Apr 28 '21 23:04 SteveLasker

I'm curious why you stayed with the container image mediaTypes, vs. defining your own, vnd.brew.*? Is this because you can run them as container images with docker run? Or, because docker hub hasn't opened the mediaTypes yet?

Primarily to support uploading these image indexes using skopeo, so that we didn't need to reinvent that particular wheel. We store the images on GitHub Package Registry, so limitations of Docker Hub weren't a primary concern, although it's a bonus if the images can be stored on multiple registries. Downloading the images works with skopeo, oras, and even docker, though the Homebrew client just uses curl.

sjackman avatar Apr 28 '21 23:04 sjackman

Primarily to support uploading these image indexes using skopeo

Gotcha, so if skopeo supported flexible manifest.config.mediaTypes, that would enable you to identify the type in a registry, differentiating it from other types. Image-index wouldn't care what the config.mediaType is for the platform specific manifest.

it's a bonus if the images can be stored on multiple registries.

Docker Hub is actually the only registry I know of that doesn't support expanded mediaTypes. It's something they're working on.

SteveLasker avatar Apr 28 '21 23:04 SteveLasker

Skopeo may actually support different manifest.config.mediaType. I don't believe we tested precisely that. In the end we went with "annotations": { "com.github.package.type": "homebrew_bottle" } to distinguish Homebrew bottles from other images.

$ curl -s -H 'Accept: application/vnd.oci.image.index.v1+json' -H 'Authorization: Bearer QQ==' https://ghcr.io/v2/homebrew/core/hello/manifests/2.10 | jq -r '.annotations."com.github.package.type"'
homebrew_bottle

sjackman avatar Apr 28 '21 23:04 sjackman

I'm actually interested in looking into the possibility of making the Homebrew Docker images generally usable by docker run, by including their dependencies as layers, and perhaps one more layer for an OS if needed. It's just an idea right now, but it ought to work.

sjackman avatar Apr 28 '21 23:04 sjackman

by including their dependencies

Yup, this is the core of the manifest reference types in this PR. Package A depends on B & C. However, Package B & C are also independently pullable. By having each defined as an artifact, you can declare dependencies between them.

Using the oci.artifact.manifest, and eventually #37, you can declare package A has a manifest reference to B.

By storing these as independent artifacts for each package type, you're not limited to a package having a single layer and all the annotations, signing and other aspects are maintained.

The multi-arch angle is just as interesting as you can declare platform-specific manifests, with the index pivoting on the platform.

The idea behind using the oci.image.manifest.config.mediaType, or the manifest.artifactType in this PR, is registries, security scanners, CLIs don't have to read specific artifact type annotations to understand it's a bottle vs. something else.
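
As a sketch of what that buys a tool: the type can be read from one well-known place instead of per-artifact annotations. The `artifact_type` helper and the bottle media type below are hypothetical; the image config media type is the one defined by the image-spec.

```python
def artifact_type(manifest):
    """Prefer manifest.artifactType, falling back to manifest.config.mediaType."""
    if "artifactType" in manifest:
        return manifest["artifactType"]
    return manifest.get("config", {}).get("mediaType")


image_manifest = {
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": {"mediaType": "application/vnd.oci.image.config.v1+json"},
}
bottle_manifest = {
    # Hypothetical artifact type for a Homebrew bottle.
    "artifactType": "application/vnd.example.bottle.v1",
}

print(artifact_type(image_manifest))
print(artifact_type(bottle_manifest))
```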

Here's some examples for using the mediaType, vs. annotations: https://github.com/opencontainers/artifacts/blob/2c9db9b2da2a357307e7043bb9142327dbdda0ca/authoring-artifacts.md

Buried in this PR is an early version that needs to be revived where you can specify the logo, localized strings that registries or clis could display when they encounter your artifact type. Compare to the way a filesystem knows what icons and actions to present based on the file extension: https://github.com/opencontainers/artifacts/blob/2c9db9b2da2a357307e7043bb9142327dbdda0ca/authoring-artifacts.md#defining-the-artifact-type

SteveLasker avatar Apr 29 '21 00:04 SteveLasker