Considerations for using the OCI spec for signing and verifying models

Open • font opened this issue 9 months ago • 9 comments

Question

Have we considered using the OCI spec as the format for signing and verifying models? Or perhaps consider tooling to translate between the current format in this repo and the OCI spec format?

Background

Since the cloud-native ecosystem is heavily reliant on containers, it may make sense to leverage the tooling in that ecosystem for at least packaging and deploying ML models. However, if we use the solution in this project -- a detached signature over a generated manifest containing file:hash entries for each file in the model repo -- and then package all of that into an OCI container image that is subsequently also signed, we end up hashing and signing everything twice, and we lose many of the advantages of the original approach taken in this project.

Current Approach

The current approach in sigstore/model-transparency signs a manifest of file:hash pairs and stores it as a detached signature (e.g., model.sig), allowing verification of ML models by recomputing hashes from local files. While effective, this requires access to the full model files for verification, which can be inefficient for large models (e.g., gigabyte-scale checkpoints), especially during deployment into Kubernetes.
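
For illustration, here is a minimal sketch of this style of check: recompute SHA-256 digests for local files and compare them against a manifest of file:hash pairs. This is not the project's actual implementation (the function names and JSON manifest layout are assumptions for the example); it only shows why the full model must be on disk to verify today.

```python
# Hypothetical sketch of manifest-based verification by recomputing local
# file hashes; not model-transparency's real code or manifest format.
import hashlib
import json
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_manifest(model_dir: Path, manifest_path: Path) -> bool:
    """Compare every file's recomputed hash with the (already signed) manifest.

    The whole model has to be present locally, which is the overhead noted
    above for gigabyte-scale checkpoints.
    """
    manifest = json.loads(manifest_path.read_text())  # {"relative/path": "<sha256 hex>", ...}
    return all(
        hash_file(model_dir / rel_path) == expected
        for rel_path, expected in manifest.items()
    )
```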

OCI Approach

In contrast, OCI (Open Container Initiative) images, as used for container images, offer a model where each layer has an independent digest, listed in a manifest, which is signed (e.g., via cosign) and stored in a registry. Verification tools like the Sigstore Policy Controller fetch only the manifest and signature from the registry, avoiding layer downloads until usage. This separation of verification (lightweight metadata) from usage (full content) could enhance efficiency for signing and verifying ML models in cloud-native applications, e.g., Kubernetes.
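
To make the comparison concrete, here is a rough sketch of what an OCI image manifest for a model could look like, with each file as a layer carrying its own digest. The media types, digests, and sizes are illustrative placeholders, not a finalized spec; the point is that the signed object is the small manifest, not the layers.

```python
# Illustrative OCI-style image manifest for a model; media types, digests,
# and sizes are placeholders, not a finalized specification.
import hashlib
import json

manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": {
        "mediaType": "application/vnd.ai.model.config.v1+json",  # assumed config media type
        "digest": "sha256:6e3b...",  # placeholder digest of a small config blob
        "size": 412,
    },
    "layers": [
        {
            # each model file (or bundle of files) becomes a layer with its own digest
            "mediaType": "application/vnd.ai.model.layer",
            "digest": "sha256:9f2c...",  # placeholder
            "size": 7_340_032,
            "annotations": {"org.opencontainers.image.title": "config.json"},
        },
        {
            "mediaType": "application/vnd.ai.model.layer",
            "digest": "sha256:b81d...",  # placeholder
            "size": 13_958_643_712,
            "annotations": {"org.opencontainers.image.title": "model.safetensors"},
        },
    ],
}

# A detached signature (e.g., via cosign) is bound to the digest of this
# kilobyte-sized manifest document, so checking the signature never requires
# downloading the multi-gigabyte layers themselves.
manifest_bytes = json.dumps(manifest, separators=(",", ":")).encode()
manifest_digest = "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()
```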

Sample Implementation

  1. Model as OCI Image:
    • Treat each model file (e.g., weights, config) as an OCI layer, or bundle the model directory into one or more layers using a custom media type (e.g., application/vnd.ai.model.layer).
    • Generate an OCI image manifest listing layer digests, mirroring the current file:hash manifest.
  2. Signing:
    • Sign the manifest using sigstore-python or cosign, producing a detached signature (e.g., @sha256:.sig), consistent with the current DSSE approach.
    • Store the manifest and signature in an OCI registry alongside the model layers.
  3. Verification:
    • Extend the verify command to fetch the manifest and signature from the registry (instead of local files) and verify the signature against the manifest digest.
    • For usage, pull layers and verify their digests against the manifest, separating verification from content access (see the sketch after this list).
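
Below is a hedged sketch of that split, assuming the model layers live in an OCI registry. The registry and signature helpers are passed in as callables because they are hypothetical stand-ins for an OCI distribution client and a Sigstore/cosign verification call, not existing APIs in this repository.

```python
# Sketch of separating lightweight verification from full content access.
# `fetch_manifest`, `fetch_signature`, and `signature_is_valid` are
# hypothetical stand-ins supplied by the caller.
import hashlib
import json
from typing import Callable


def verify_model_reference(
    repo: str,
    tag: str,
    fetch_manifest: Callable[[str, str], bytes],        # hypothetical registry client
    fetch_signature: Callable[[str, str], bytes],        # hypothetical signature fetch
    signature_is_valid: Callable[[str, bytes], bool],    # hypothetical Sigstore/cosign check
) -> dict:
    """Admission-time check: only the manifest and its signature are fetched."""
    manifest_bytes = fetch_manifest(repo, tag)           # kilobytes of JSON
    manifest_digest = "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()
    signature = fetch_signature(repo, manifest_digest)
    if not signature_is_valid(manifest_digest, signature):
        raise ValueError("manifest signature verification failed")
    return json.loads(manifest_bytes)                    # trusted list of layer digests


def verify_layer_on_pull(layer_bytes: bytes, expected_digest: str) -> None:
    """Usage-time check: each layer pulled later is verified against the signed manifest."""
    actual = "sha256:" + hashlib.sha256(layer_bytes).hexdigest()
    if actual != expected_digest:
        raise ValueError(f"layer digest mismatch: {actual} != {expected_digest}")
```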

Benefits

  • Reduced Verification Overhead: Verification requires only the manifest and signature (kilobytes) rather than the full model (gigabytes), improving performance.
  • Registry Support: OCI registries provide a standardized storage and distribution mechanism, potentially compatible with platforms like Hugging Face.
  • Alignment with Sigstore: Builds on cosign’s OCI signing workflow, ensuring consistency with container image practices.

Relation to Current Approach

The current manifest of file:hash pairs is conceptually similar to an OCI manifest listing layer digests. This proposal adapts that idea to OCI’s structure, shifting verification to trust the registry’s manifest (like container images) rather than recomputing hashes locally. It retains the detached signature model but leverages OCI’s ecosystem for efficiency.

At a minimum, we could explore the packaging of ML models as OCI images, where model files (or directories) are treated as layers, the manifest lists their digests, and the manifest is signed with a detached signature. This could align with the current manifest-based approach while leveraging OCI’s ecosystem and verification efficiency.

The crux of the issue may boil down to what ends up being the de facto packaging and distribution standard for ML models, especially for cloud-native environments like Kubernetes.

font · Apr 05 '25 02:04

Right now we are indeed comparing the manifest from the signature with the manifest computed from the local model, but we are planning to support partial verification.

It is somewhat captured in #160 and https://github.com/sigstore/model-transparency/blob/621211a1e91b396e6335be576da9e3698d456311/src/model_signing/manifest.py#L31-L35

mihaimaruseac · Apr 07 '25 20:04

Right now we are indeed comparing the manifest from the signature with the manifest computed from the local model, but we are planning to support partial verification.

It is somewhat captured in #160 and in model-transparency/src/model_signing/manifest.py, lines 31 to 35 at 621211a:

"... name and in associated hash. In the future we will support partial object matching. This is useful, for example, for the cases where the original model contained files for multiple ML frameworks, but the user only uses the model with one framework. This way, the user can verify the integrity only for the files that are actually used."

I think OCI should be able to support partial verification. If each file were a separate layer, then for a partial verification you would regenerate the manifest entries for only those layers (i.e., files) that have changed. This would be done by recomputing each affected layer's digest (since layer digests are independent); the model's OCI image manifest would then be updated to reflect the new layer digests, which in turn updates the manifest's own root digest.

That is, for AI models packaged as OCI images, it could support partial verification in a few ways:

  1. Subset of Layers: If we map model files (e.g., weights, config) to individual layers, a verifier could fetch the manifest and signature, then selectively pull and verify only specific layers’ digests. For instance, verifying a config file’s layer without downloading a massive weights layer could confirm part of the model’s integrity (see the sketch after this list).
  2. Custom Policy: A verification tool (e.g., an extended verify.py) could enforce policies like “approve if at least the metadata layer’s digest matches,” enabling partial trust decisions without the full model.
  3. Incremental Checks: Since layer digests are independent, you could verify layers incrementally as they’re pulled, stopping early if a critical subset fails, which aligns with partial verification workflows.
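
As a rough illustration of option 1, the sketch below verifies only the layers a user actually needs, selected here by the OCI title annotation. It assumes the manifest layout sketched earlier in the thread; `pull_layer` is a hypothetical blob-fetching callable, not an existing API.

```python
# Partial verification sketch: hash only the layers whose titles the user
# requested, skipping (and never downloading) everything else.
import hashlib
from typing import Callable


def verify_selected_layers(
    manifest: dict,
    wanted_files: set[str],
    pull_layer: Callable[[str], bytes],   # hypothetical registry blob fetch by digest
) -> bool:
    for layer in manifest["layers"]:
        title = layer.get("annotations", {}).get("org.opencontainers.image.title")
        if title not in wanted_files:
            continue                      # e.g. skip weights for an unused ML framework
        blob = pull_layer(layer["digest"])
        actual = "sha256:" + hashlib.sha256(blob).hexdigest()
        if actual != layer["digest"]:
            return False                  # stop early if a critical subset fails
    return True
```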

font · Apr 09 '25 01:04

please see some of the discussion in https://github.com/cncf/sandbox/issues/358#issuecomment-2784998207

dims · Apr 10 '25 19:04

We already practice this as part of KitOps. The heart of the KitOps project is an OCI artifact named ModelKit that is used to package AI artifacts, including models. KitOps is a CNCF project. You can find out more about it on kitops.org, but in short, it implements an OCI artifact that closely resembles what is described above.

At Jozu, the company where I work, we host our OCI registry, Jozu Hub, for ModelKits and sign our models (such as this one) using cosign. There is even greater potential in the use of OCI, though. For instance, Jozu Hub also scans models for serialization attacks and attaches the scan results as attestations. This security page simply retrieves the OCI attestations and renders them. However, some of our users leverage this mechanism to define additional checks and policies for their use cases.

We are also moving full steam ahead to create a specification around this to improve interoperability and have proposed the ModelPack project to CNCF to collaborate on this work.

We would be happy to hear your feedback both on the ModelPack specification or on the KitOps project for more practical usages. We are a friendly bunch over there 😄

gorkem · Apr 10 '25 20:04

Hi, @gorkem

Jozu Hub also scans the models for serialization attacks and attaches the scan results as attestations.

I'm just curious about how this is implemented. Could you kindly point to the specific code files in the KitOps code base?

Thank you!

caozhuozi · Apr 12 '25 13:04

@caozhuozi This implementation is part of the Jozu product at this time.

gorkem · Apr 24 '25 16:04

@gorkem Thanks for the info! How do you view this model-transparency project in relation to the ModelPack project?

font · Apr 25 '25 18:04

@font Hi, if you are interested in how these two projects can work together, you could join our Slack channel to discuss it: https://cloud-native.slack.com/archives/C07T0V480LF (under the CNCF workspace) ❤️

caozhuozi · May 09 '25 05:05

@font I think ModelPack, KitOps, and model-transparency are complementary projects. Although we use OCI (Open Container Initiative)—which provides established mechanisms for attestations and provenance—the lack of AI/ML–specific standards for these artifacts continues to impede interoperability.

For example, SLSA for ML is important, but we should be aware that security and risk management for AI/ML involves additional capabilities, such as model versioning, data lineage, performance metrics, and fairness or bias attestations, that need to be addressed. Some of these requirements fall naturally within the scope of ModelPack standardization, while others align better with this project’s goals.

To demonstrate interoperability between our efforts, I propose we plan joint activities. First, let’s schedule a meet-and-greet session: we can host your team on the ModelPack bi-weekly call. Then, you can introduce your team members to ModelPack and we can explore further collaboration opportunities.

gorkem · May 09 '25 13:05