
Revisit the Design of ORAS

Open shizhMSFT opened this issue 3 years ago • 2 comments

Originally, ORAS was designed to upload arbitrary files to a registry, mapping each file to a layer blob. A remote layer blob can therefore be downloaded to the local disk and renamed according to the org.opencontainers.image.title annotation, without further operations such as untarring, decompressing, or sequential extraction.

Example manifest:

{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
    "size": 2,
    "annotations": {
      "hello": "world"
    }
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar",
      "digest": "sha256:22af0898315a239117308d39acd80636326c4987510b0ec6848e58eb584ba82e",
      "size": 6,
      "annotations": {
        "fun": "more cream",
        "org.opencontainers.image.title": "cake.txt"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar",
      "digest": "sha256:be6fe11876282442bead98e8b24aca07f8972a763cd366c56b4b5f7bcdd23eac",
      "size": 7,
      "annotations": {
        "org.opencontainers.image.title": "juice.txt"
      }
    }
  ],
  "annotations": {
    "foo": "bar"
  }
}
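
The pull behavior that this manifest enables can be sketched as follows. This is a minimal Python illustration, not the ORAS implementation: the helper name plan_file_pull is hypothetical, and the manifest is a trimmed copy of the example above that keeps only the fields the sketch reads.

```python
# Annotation key that carries the local file name for a layer blob.
TITLE = "org.opencontainers.image.title"


def plan_file_pull(manifest):
    """Return (digest, file name) pairs for every layer that carries a title.

    Hypothetical helper: each blob would be saved as-is under its title,
    with no untar or decompression step.
    """
    plan = []
    for layer in manifest.get("layers", []):
        annotations = layer.get("annotations") or {}
        title = annotations.get(TITLE)
        if title is not None:
            plan.append((layer["digest"], title))
    return plan


# Trimmed copy of the example manifest (layers only).
example_manifest = {
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar",
            "digest": "sha256:22af0898315a239117308d39acd80636326c4987510b0ec6848e58eb584ba82e",
            "annotations": {"fun": "more cream", TITLE: "cake.txt"},
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar",
            "digest": "sha256:be6fe11876282442bead98e8b24aca07f8972a763cd366c56b4b5f7bcdd23eac",
            "annotations": {TITLE: "juice.txt"},
        },
    ]
}

for digest, name in plan_file_pull(example_manifest):
    print(f"{digest} -> {name}")
```

Because every layer names its own target file, the layers are independent of each other and could be fetched in any order.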

Evidently, this design supports files but not directories. How do we support directories? It is as simple as packing the directory into a tarball and marking it to be unpacked when pulled.

{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.unknown.config.v1+json",
    "digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
    "size": 2
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:abb19a9ec5da5811e9cda1598dbded9f9344302aca2daf15fcb7d046e57aa473",
      "size": 787,
      "annotations": {
        "io.deis.oras.content.digest": "sha256:f34d88290cc95c33160cf2f6f3ed526b955dd14ec2a244ff834a2ab1bbf55c03",
        "io.deis.oras.content.unpack": "true",
        "org.opencontainers.image.title": "cache"
      }
    }
  ]
}

This leads to a question: what if we blindly extract every tar or tar+gzip blob without checking for "io.deis.oras.content.unpack": "true"? That causes ambiguity, because cache in the above manifest could be either a directory or a file. Hence, those annotations in the manifest are necessary.
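
To make the ambiguity concrete, here is a small Python sketch of a client deciding what to do with a layer from its annotations alone. The function name pull_action and its return values are hypothetical and chosen only for illustration; the two annotation keys are the ones shown in the manifest above.

```python
TITLE = "org.opencontainers.image.title"
UNPACK = "io.deis.oras.content.unpack"


def pull_action(layer):
    """Decide what to do with a layer blob from its annotations alone.

    Returns ("extract", name), ("save", name), or ("unknown", None).
    Hypothetical helper: the media type is deliberately not consulted,
    since it cannot tell a directory apart from a file that is a tarball.
    """
    annotations = layer.get("annotations") or {}
    name = annotations.get(TITLE)
    if name is None:
        return ("unknown", None)
    if annotations.get(UNPACK) == "true":
        return ("extract", name)
    return ("save", name)


directory_layer = {
    "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
    "annotations": {UNPACK: "true", TITLE: "cache"},
}
# Same media type and title, but without the unpack marker.
file_layer = {
    "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
    "annotations": {TITLE: "cache"},
}

print(pull_action(directory_layer))
print(pull_action(file_layer))
```

The two layers differ only in the unpack annotation, yet one becomes a directory named cache and the other a file named cache.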

Take a Helm chart pushed by the ORAS library as an example:


{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.cncf.helm.config.v1+json",
    "digest": "sha256:8db342103d3991af6189b36a05bb25fb2412b80501f4ecb0feca042c8c688688",
    "size": 200
  },
  "layers": [
    {
      "mediaType": "application/tar+gzip",
      "digest": "sha256:2b34e2b5905dc47ce3a98625c2244b10d613d041dadd47c3b012188a468bd272",
      "size": 2862
    }
  ]
}

Unless the client has Helm-specific context, it does not know how to handle a layer blob with the media type application/tar+gzip.

  • It could be a file, but it does not have a name.
  • It could be a directory, but the client has no idea where to extract it securely.

As a result, the ORAS CLI has no means to pull the above Helm chart artifact, even though it was pushed via the ORAS library.


@SteveLasker proposed in #178 that we can always pack and compress the desired files before pushing to the registry, so that all file metadata is kept in the tarball and the annotations in the manifest stay clean.
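
As a rough illustration of that idea, the following Python sketch packs a single file into an in-memory tar+gzip blob and reads its metadata back from the blob itself. The helper names pack_file and list_entries are hypothetical, and the file name and content are made up for the example; this is not how ORAS packs content.

```python
import io
import tarfile


def pack_file(name, data, mode=0o644):
    """Pack one file into an in-memory tar+gzip blob (hypothetical helper)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        info.mode = mode
        archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_entries(blob):
    """Read (name, size, mode) for each entry straight from the blob."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        return [(m.name, m.size, m.mode) for m in archive.getmembers()]


blob = pack_file("cake.txt", b"cream\n")
print(list_entries(blob))
```

The name, size, and permissions travel inside the blob, so no title annotation is needed. The cost is that the blob's digest is now the digest of the compressed tarball rather than of the file content.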

However, that proposal has two major drawbacks:

  1. Backward compatibility - the new packing method renders previously pushed artifacts invalid.
  2. Non-straightforward digests and ambiguous ordering - https://github.com/oras-project/oras/issues/178#issuecomment-813947002

Here is the next question: can we enhance ORAS to take advantage of both paradigms? The answer is probably yes.

We can classify the manifests into 3 categories:

  1. All layers have annotations.
  2. Not all layers have annotations, and those layers without annotations are all tarballs.
  3. Any other manifest, that is, one that does not fall into the above two categories.

For 1, we can pull the artifact using the default ORAS behavior. For 2, we cannot download and process the layer blobs in parallel; instead, we should download and process them sequentially. Layer blobs with annotations are processed using the default ORAS behavior, and unnamed tarballs are extracted to the current working directory. For 3, ORAS can give the user the option to either ignore the unnamed non-tarballs or process them. If the user chooses to ignore them, the manifest is handled as category 1 or 2. Otherwise, the method for category 2 is applied, and the unnamed non-tarballs are downloaded and named according to their digests.
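
The classification and the resulting pull strategy can be sketched in Python as follows. This is an illustration under stated assumptions, not a proposed implementation: "has annotations" is interpreted as "carries an org.opencontainers.image.title annotation", tarballs are recognized purely by the media types that appear in the manifests above, and the names classify, pull_plan, and the step labels are hypothetical.

```python
TITLE = "org.opencontainers.image.title"

# Assumption: only the media types seen in the manifests above count as
# tarballs; a real client might recognize more.
TARBALL_MEDIA_TYPES = {
    "application/vnd.oci.image.layer.v1.tar",
    "application/vnd.oci.image.layer.v1.tar+gzip",
    "application/tar+gzip",
}


def is_named(layer):
    return TITLE in (layer.get("annotations") or {})


def classify(manifest):
    """Return the category (1, 2, or 3) of a manifest."""
    unnamed = [l for l in manifest.get("layers", []) if not is_named(l)]
    if not unnamed:
        return 1
    if all(l.get("mediaType") in TARBALL_MEDIA_TYPES for l in unnamed):
        return 2
    return 3


def pull_plan(manifest, ignore_unnamed_non_tarballs=False):
    """Return the ordered list of (step, target) pairs for a pull."""
    steps = []
    for layer in manifest.get("layers", []):
        if is_named(layer):
            # Default behavior: save (or unpack) under the annotated title.
            steps.append(("default", layer["annotations"][TITLE]))
        elif layer.get("mediaType") in TARBALL_MEDIA_TYPES:
            # Unnamed tarball: extract into the current working directory.
            steps.append(("extract-to-cwd", layer["digest"]))
        elif not ignore_unnamed_non_tarballs:
            # Unnamed non-tarball: keep it, named after its digest.
            steps.append(("save-as-digest", layer["digest"]))
    return steps


# Layer of the Helm chart manifest shown earlier (trimmed).
helm_manifest = {
    "layers": [
        {
            "mediaType": "application/tar+gzip",
            "digest": "sha256:2b34e2b5905dc47ce3a98625c2244b10d613d041dadd47c3b012188a468bd272",
        }
    ]
}

print(classify(helm_manifest))
print(pull_plan(helm_manifest))
```

The plan preserves layer order because, for categories 2 and 3, extracting unnamed tarballs into the working directory can overwrite files written by earlier layers, so the steps are meant to run sequentially.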

shizhMSFT, Sep 03 '21 16:09

Before releasing 1.0, we have to agree on a clear picture of all exposed types.

https://gist.githubusercontent.com/sajayantony/75bd90a7f3db4980b02b32bb6d23fd54/raw/e730fc31d6a87014f040f394289c904228af3601/diag.svg

sajayantony, Nov 02 '21 22:11

Given the number of requests to show manifests, I've started a discussion here: https://github.com/oras-project/oras/discussions/340 /cc @mnltejaswini @shizhMSFT @deitch @jdolitsky

sajayantony, Nov 16 '21 05:11

This issue was closed because it has been stalled for 30 days with no activity.

github-actions[bot], Jul 29 '23 01:07