image-spec Support the media type `application/vnd.oci.image.layer.v1.raw`

This defines a new MediaType whose blob is the original file without any processing, with the file's target path recorded in an annotation. This would be very helpful for large language models: a model can be very large (for example, 100 GB), so skipping the packaging step would be a significant optimization.

This MediaType is not just for Kubernetes. It provides a standardized way to handle raw data efficiently for different container environments.

Faster bootstrapping: On HDD-backed devices, unpacking layers can also slow startup. With a raw layer, this step can be skipped, which speeds up bootstrapping after the first image pull. This helps in environments where workloads need to be up and running quickly.

Storage Space Savings: This MediaType can save almost half of the storage space, which is crucial for devices with limited storage.

I expected something like this

COPY --type=raw ./file /file

...
   "layers": [
      ...
      {
         "mediaType": "application/vnd.oci.image.layer.v1.raw",
         "size": 10000000000,
         "digest": "sha256:",
         "annotations": {
            "org.opencontainers.image.layer.path": "/file"
         }
      }
      ...
   ]
...

User Story: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4639-oci-volume-source

Sep 03 '24 11:09 wzshiming
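
To make the proposal above concrete, here is a minimal sketch of how a tool might build such a descriptor. It assumes the media type and annotation key exactly as proposed in this issue (neither is part of the OCI image spec), and `raw_layer_descriptor` is a hypothetical helper name, not an existing API.

```python
import hashlib
import json
import os
import tempfile

# Media type and annotation key as proposed in this issue; neither is part of
# the OCI image spec today.
RAW_MEDIA_TYPE = "application/vnd.oci.image.layer.v1.raw"
PATH_ANNOTATION = "org.opencontainers.image.layer.path"


def raw_layer_descriptor(src_path, target_path, chunk_size=1 << 20):
    """Describe a file as a raw layer without repacking or compressing it."""
    sha = hashlib.sha256()
    size = 0
    with open(src_path, "rb") as f:
        # Hash in chunks so a very large file never has to fit in memory.
        while chunk := f.read(chunk_size):
            sha.update(chunk)
            size += len(chunk)
    return {
        "mediaType": RAW_MEDIA_TYPE,
        "size": size,
        "digest": "sha256:" + sha.hexdigest(),
        "annotations": {PATH_ANNOTATION: target_path},
    }


if __name__ == "__main__":
    # Small placeholder file standing in for a large model file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"model weights would go here")
    try:
        print(json.dumps(raw_layer_descriptor(tmp.name, "/file"), indent=3))
    finally:
        os.remove(tmp.name)
```

Because the blob is the file itself, the digest in the descriptor is the same value that `sha256sum` would report for the original file.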

+1

application/vnd.oci.image.layer.v1.raw can remove the time-consuming compression and decompression of AI model weights. Gzip achieves a low compression ratio on model data, so the compression and decompression steps add time without saving much space.

Sep 03 '24 13:09 gaius-qi
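
The compression point above can be roughly illustrated with a small experiment. This is only a sketch: it assumes random bytes are a reasonable stand-in for high-entropy model weights, which is an approximation, so real ratios will differ by model and data type.

```python
import gzip
import os


def gzip_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(gzip.compress(data)) / len(data)


# Assumption: random bytes stand in for model weights. Real weights are not
# perfectly random, so actual ratios will vary.
high_entropy = os.urandom(1 << 20)
# Highly repetitive data, for contrast.
repetitive = b"layer " * (1 << 17)

if __name__ == "__main__":
    print(f"high-entropy: {gzip_ratio(high_entropy):.3f}")
    print(f"repetitive:   {gzip_ratio(repetitive):.3f}")
```

Measuring the ratio on a sample of the actual weights would be the more reliable way to decide whether compression is worth its cost.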

Agreed with Brandon; if compression is the bottleneck, I'd suggest using uncompressed (application/vnd.oci.image.layer.v1.tar), especially as using a standard tar header on the data instead of a custom annotation adds other benefits like all the existing standard container ecosystem tooling being able to do something meaningful with the blobs (RUN --mount=type=image,..., the new k8s functionality, https://oci.dag.dev, etc).

Sep 03 '24 19:09 tianon

Thanks for your reply, I've updated my content accordingly and commented on the previous discussion for anyone who might be interested in moving over here.

I think this is a great optimization for large files in images. I look forward to a discussion that will lead to a more acceptable solution. 🙏🙏🙏

Sep 04 '24 03:09 wzshiming

How will we persuade all container runtimes to adopt this special layer format; how will we provide compatibility for runtimes that don't yet include that support?

Sep 04 '24 09:09 sftim

Alternative: define a new type OCI artefact (not a layer) that represents a chunk of application data. This can then be mapped to something that looks like a file and mounted where a container can access it. For example, using a CSI driver and Kubernetes.

Sep 04 '24 09:09 sftim

> Alternative: define a new type OCI artefact (not a layer) that represents a chunk of application data. This can then be mapped to something that looks like a file and mounted where a container can access it. For example, using a CSI driver and Kubernetes.

You have a point. There seems to be a precedent for this approach, such as application/vnd.in-toto+json:

https://oci.dag.dev/?image=docker/dockerfile:1.5.1

crane manifest docker/dockerfile:1.5.1 | jq .
{
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
...
        {
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            "digest": "sha256:cd6383b1260aee71593cb70bdd44d50daf1ba142ed54191972fce08694ddfe35",
            "size": 839,
            "platform": {
                "architecture": "unknown",
                "os": "unknown"
            },
            "annotations": {
                "vnd.docker.reference.digest": "sha256:e7748818724fa5f622da18698f9f5b16e0f32e5a6b9af888fd84053eb48e9cfd",
                "vnd.docker.reference.type": "attestation-manifest"
            }
        },
...
    ]
}

https://oci.dag.dev/?image=docker/dockerfile@sha256:cd6383b1260aee71593cb70bdd44d50daf1ba142ed54191972fce08694ddfe35

crane manifest docker/dockerfile@sha256:cd6383b1260aee71593cb70bdd44d50daf1ba142ed54191972fce08694ddfe35 | jq .
{
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": {
        "mediaType": "application/vnd.oci.image.config.v1+json",
        "digest": "sha256:083187ca1ac1b4eebff9e2c982bb462d947a34e6d93be8a7d33491cf933881c9",
        "size": 241
    },
    "layers": [
        {
            "mediaType": "application/vnd.in-toto+json",
            "digest": "sha256:4297014349dde3f52863838747b5a4bacf2058d0951dae5b48025741611496a7",
            "size": 39882,
            "annotations": {
                "in-toto.io/predicate-type": "https://spdx.dev/Document"
            }
        },
        {
            "mediaType": "application/vnd.in-toto+json",
            "digest": "sha256:ff81f2987309a567da898c358437b2943cc297007b99491a4ea13f014b90a449",
            "size": 14542,
            "annotations": {
                "in-toto.io/predicate-type": "https://slsa.dev/provenance/v0.2"
            }
        }
    ]
}

Sep 04 '24 10:09 wzshiming

The OCI guidance for packaging artifacts can be found at: https://github.com/opencontainers/image-spec/blob/main/artifacts-guidance.md

Specifically, the image manifest covers how to set the media types on the layers already: https://github.com/opencontainers/image-spec/blob/main/manifest.md#guidelines-for-artifact-usage

To take the usual made-up example, let's assume I've shipped a cat picture as an artifact. Today, that would be packaged as an image/jpeg layer, and tooling that expects that layer type would already exist and would break if it saw any non-image type.

With that example, how does the cat picture artifact know the path that it should be mounted in every container? Some of those containers could be web servers, each serving content from a different path. Other containers could be a wasm tool converting the format. And other containers could be ML models validating or training on pictures. With all of these containers, I don't see how the artifact knows its path that applies equally to all of them.

If an artifact is so specific that it can only be used with one image, then an image should be created with that artifact as a new layer. The result would be much more portable. If gzip compression is the problem in that scenario, the layer can be zstd compressed, or uncompressed, using the current spec already.

Sep 04 '24 11:09 sudo-bmitch
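
For reference, the cat picture example above could look roughly like the following when packaged along the lines of the linked guidance. This is a sketch under assumptions: the `artifactType` value is a made-up placeholder, the JPEG bytes are fake, and `artifact_manifest` is a hypothetical helper rather than an existing tool's API.

```python
import hashlib
import json

# Content of the "empty" config blob used by artifacts that need no config.
EMPTY_JSON = b"{}"


def descriptor(media_type: str, blob: bytes) -> dict:
    """Build a content descriptor for an in-memory blob."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


def artifact_manifest(artifact_type: str, layer_media_type: str, blob: bytes) -> dict:
    """Build an image manifest that carries an artifact rather than a rootfs."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "artifactType": artifact_type,
        "config": descriptor("application/vnd.oci.empty.v1+json", EMPTY_JSON),
        "layers": [descriptor(layer_media_type, blob)],
    }


if __name__ == "__main__":
    fake_jpeg = b"\xff\xd8 not a real picture \xff\xd9"  # placeholder bytes
    manifest = artifact_manifest(
        "application/vnd.example.cat-picture.v1",  # hypothetical artifact type
        "image/jpeg",
        fake_jpeg,
    )
    print(json.dumps(manifest, indent=4))
```

Note that nothing in this manifest names a mount path. Where the picture ends up is left to whatever consumes the artifact, which matches the concern raised above about a path annotation that would have to fit every container.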

> especially as using a standard tar header on the data

I generally agree; however, a downside of this is that the thin tar wrapper makes the blob different from the original input. It has a different checksum, size, etc., which obscures its linkage and provenance.

But I also think the AI model use case is one where OCI artifacts are just more appropriate; the fact that it's architecture independent data among other things argues for that.

Sep 04 '24 12:09 cgwalters
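
The provenance concern above is easy to reproduce. The sketch below wraps a placeholder payload in an uncompressed single-entry tar using Python's standard `tarfile` module and compares digests and sizes; `tar_wrap` and the payload are illustrative names and data only.

```python
import hashlib
import io
import tarfile


def tar_wrap(name: str, payload: bytes) -> bytes:
    """Wrap payload in an uncompressed, single-entry tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


payload = b"stand-in for model weights" * 100
wrapped = tar_wrap("file", payload)

# The registry would address the wrapped blob, not the original file.
raw_digest = hashlib.sha256(payload).hexdigest()
tar_digest = hashlib.sha256(wrapped).hexdigest()

if __name__ == "__main__":
    print("raw:", len(payload), raw_digest)
    print("tar:", len(wrapped), tar_digest)
```

A consumer that only sees the layer digest cannot match it against a published checksum of the original file without unwrapping the tar first.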

Are there other approaches that address the core requirement and align better with existing workflows? https://github.com/opencontainers/image-spec/issues/1190

Sep 10 '24 18:09 rchincha

> ... taken up on the K8s side. I don't think we should contort the definition of an OCI image to attempt to force them to accept what ...

On the Kubernetes side, this wouldn't go into core Kubernetes either - it's most likely to be an extension you could add to a cluster if you want to.

Sep 13 '24 09:09 sftim

I'm closing this out since it's been a few months and there don't appear to be any maintainers in favor of the change.

Dec 05 '24 17:12 sudo-bmitch

@sudo-bmitch I know this was closed, but I noticed that you mentioned zstd-compressed tar layers. What about just a zstd-compressed raw file: no tar, one file per layer? Could that be considered as an extension?

Mar 19 '25 19:03 s3rj1k

> @sudo-bmitch I know this was closed, but I noticed that you mentioned zstd-compressed tar layers. What about just a zstd-compressed raw file: no tar, one file per layer? Could that be considered as an extension?

That's already a registered media type: https://www.iana.org/assignments/media-types/media-types.xhtml

Note that OCI does not typically direct runtimes on what they must support. Runtimes add capabilities and then once they are verified in production we standardize them for interoperability between different runtimes. In other words, we are typically a trailing spec. So you would first implement this in a runtime.

Mar 19 '25 19:03 sudo-bmitch

> Note that OCI does not typically direct runtimes on what they must support. Runtimes add capabilities and then once they are verified in production we standardize them for interoperability between different runtimes. In other words, we are typically a trailing spec. So you would first implement this in a runtime.

Hmm, that makes sense, thank you

Mar 19 '25 19:03 s3rj1k

I'd also stress again that an uncompressed tar is just a binary header prefix on your data with filename, size, permissions, etc, and using that allows existing ecosystem tools to process and browse your data without any changes. Your data is otherwise verbatim, it just starts with a bit of metadata in a binary format that you'd otherwise likely end up adding in JSON annotations (effectively reproducing a tar header, but less efficiently). Please consider using an uncompressed tar header instead of raw data (compressed or otherwise).

Edit: see https://pkg.go.dev/archive/tar#Header for the exact fields

Mar 21 '25 03:03 tianon
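
As an illustration of the tar point above, the sketch below builds an uncompressed single-entry tar in memory with Python's standard `tarfile` module, then checks that the payload sits unchanged right after the first 512-byte header block and that the metadata can be read back from the header. The file name and payload are made up for the example.

```python
import io
import tarfile

BLOCK = 512  # tar headers and data are laid out in 512-byte blocks

payload = b"verbatim bytes"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    info = tarfile.TarInfo(name="models/weights.bin")  # illustrative name
    info.size = len(payload)
    info.mode = 0o644
    tar.addfile(info, io.BytesIO(payload))
blob = buf.getvalue()

# The payload appears unchanged immediately after the header block.
embedded = blob[BLOCK:BLOCK + len(payload)]

# Standard tooling recovers name, size, and permissions from the header.
with tarfile.open(fileobj=io.BytesIO(blob), mode="r:") as tar:
    member = tar.next()
    header = (member.name, member.size, oct(member.mode))

if __name__ == "__main__":
    print(embedded == payload, header)
```

The rest of the archive is zero padding up to the block and record boundaries, so the overhead is small relative to a multi-gigabyte file.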

> I'd also stress again that an uncompressed tar is just a binary header prefix on your data with filename, size, permissions, etc, and using that allows existing ecosystem tools to process and browse your data without any changes. Your data is otherwise verbatim, it just starts with a bit of metadata in a binary format that you'd otherwise likely end up adding in JSON annotations (effectively reproducing a tar header, but less efficiently). Please consider using an uncompressed tar header instead of raw data (compressed or otherwise).
>
> Edit: see https://pkg.go.dev/archive/tar#Header for the exact fields

True, but consider the case of streaming data on a per-layer basis: one file per layer, compressed with zstd or gzip, is simpler to stream than a file wrapped in tar.

Mar 21 '25 10:03 s3rj1k

> True, but consider the case of streaming data on a per-layer basis: one file per layer, compressed with zstd or gzip, is simpler to stream than a file wrapped in tar.

Can you expand on that? Why is it simpler to stream data to a file on a filesystem using metadata in an external json format than with a tar header?

Mar 21 '25 11:03 sudo-bmitch

> > True, but consider the case of streaming data on a per-layer basis: one file per layer, compressed with zstd or gzip, is simpler to stream than a file wrapped in tar.
>
> Can you expand on that? Why is it simpler to stream data to a file on a filesystem using metadata in an external json format than with a tar header?

You won't need to handle the tar header inside the file in code. You just take the binary blob directly, read it, and unpack gzip/zstd as a stream if the file is compressed at all; compression is basically optional.

Consider this streaming from the point of view of a web UI that wants to show the contents of the packed file in a browser, for example. In that case it is simpler, and much safer memory-wise, to get the extra metadata from the annotation and stream the content from the blob only if it is requested.

With tar, you would similarly need to partially parse each tar header to get the metadata out first, then parse again for the content if it was requested.

For the filesystem use case specifically, tar is perfectly fine.

Additional Ref: https://github.com/cri-o/cri-o/issues/8953

Mar 21 '25 13:03 s3rj1k
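
A rough sketch of the streaming flow described in the last comment: the path comes from the descriptor's annotation and the content is decompressed chunk by chunk, only when requested. Everything here is hypothetical, including the media type string and the `stream_gzip` helper, and gzip is used because it is available in the Python standard library.

```python
import gzip
import io
import zlib


def stream_gzip(blob_reader, chunk_size=64 * 1024):
    """Yield decompressed chunks from a gzip blob without buffering it whole."""
    # wbits = 16 + MAX_WBITS tells zlib to expect gzip framing.
    decomp = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
    while chunk := blob_reader.read(chunk_size):
        out = decomp.decompress(chunk)
        if out:
            yield out
    tail = decomp.flush()
    if tail:
        yield tail


# Hypothetical descriptor for a single-file, gzip-compressed raw blob.
descriptor = {
    "mediaType": "application/vnd.example.raw+gzip",  # made-up media type
    "annotations": {"org.opencontainers.image.layer.path": "/file"},
}

original = b"0123456789" * 50_000
blob = gzip.compress(original)

# Metadata is available from the manifest before any blob byte is read.
path = descriptor["annotations"]["org.opencontainers.image.layer.path"]
# Content is pulled through the decompressor in small pieces on demand.
preview = b"".join(stream_gzip(io.BytesIO(blob), chunk_size=4096))

if __name__ == "__main__":
    print(path, len(preview))
```

In a real client the `io.BytesIO` object would be replaced by the HTTP response body of the blob request.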