oras-go
Should `oras.Copy()` follow manifests specified in the `layers` or the `blobs` field?
Currently, `oras.Copy()` follows all the successors of a node to copy all the sub-DAGs.
For manifests specified in the `layers` field of OCI Image manifests or the `blobs` field of OCI Artifact manifests, should `oras.Copy()` treat them as leaf nodes or follow them to copy the sub-DAGs?
Scenario 1: Repository
Consider the DAG below, where manifest B references manifest A as one of its layers, and manifest list C references A* (identical to A) as one of its manifests.
When copying this DAG to a remote repository, how should `oras.Copy()` handle manifests A and A*?
Manifest A* should clearly be treated as a non-leaf node and pushed to the repository via the manifest endpoint. But what about manifest A? If it is treated as a leaf node, should it be pushed to the repository via the blob endpoint?
- If so, the content of manifests A and A* has to be pushed twice, once per endpoint, and may be stored separately in the blob storage and the manifest storage of the remote repository.
- If not, manifests A and A* are pushed just once via the manifest endpoint. If manifest A is copied first as a leaf node, blob E is not copied along with it and will never be copied, because manifest A* is then skipped as already copied.
graph TD
A[Manifest A]
AS[Manifest A*]
B[Manifest B]
C[Manifest List C]
D[Manifest List D]
E[Blob E]
A -.-> E
AS --> E
B -- layers --> A
C -- manifests --> AS
D --> B
D --> C
Scenario 2: OCI Layout
Consider the DAG below, where manifest A is referenced by manifest B as a layer and by manifest list C as a manifest.
When copying this DAG to an OCI layout, `oras.Copy()` copies manifest A only once, whether or not it treats manifest A as a leaf node (to be copied along with manifest B), since an OCI layout stores manifests and blobs in the same storage.
But if manifest A is copied as a leaf node along with manifest B before manifest list C is copied, blob E will never get copied.
graph TD
A[Manifest A]
B[Manifest B]
C[Manifest List C]
D[Manifest List D]
E[Blob E]
A --> E
B -- layers --> A
C -- manifests --> A
D --> B
D --> C
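The order dependency in this scenario can be reproduced with a small sequential simulation. Everything here (`edge`, `copyDedup`, `scenario2`) is a hypothetical model written for this discussion; it assumes a single destination store that skips any node it already holds, and it is not how oras-go is implemented.

```go
package main

import "fmt"

// edge is a reference from a parent node to a child node.
// leaf marks references that the copier does not follow, such as the
// entries of a layers field.
type edge struct {
	to   string
	leaf bool
}

// copyDedup copies root into dst and then follows its references.
// A node that already exists in dst is skipped together with its sub-DAG.
func copyDedup(g map[string][]edge, root string, asLeaf bool, dst map[string]bool) {
	if dst[root] {
		return
	}
	dst[root] = true
	if asLeaf {
		return // copied as opaque content; successors are not visited
	}
	for _, e := range g[root] {
		copyDedup(g, e.to, e.leaf, dst)
	}
}

// scenario2 builds the DAG above with D's children visited in the given order.
func scenario2(first, second string) map[string][]edge {
	return map[string][]edge{
		"D": {{to: first}, {to: second}},
		"B": {{to: "A", leaf: true}}, // B -- layers --> A
		"C": {{to: "A"}},             // C -- manifests --> A
		"A": {{to: "E", leaf: true}}, // A --> E
	}
}

func main() {
	bFirst := map[string]bool{}
	copyDedup(scenario2("B", "C"), "D", false, bFirst)
	fmt.Println("B first, E copied:", bFirst["E"]) // false

	cFirst := map[string]bool{}
	copyDedup(scenario2("C", "B"), "D", false, cFirst)
	fmt.Println("C first, E copied:", cFirst["E"]) // true
}
```

Whether blob E arrives depends only on which parent of A is visited first, which is what makes the leaf treatment fragile here.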
Scenario 3: Repository Double CASs
Suppose the DAG below is being copied to a remote repository. Should manifest A be pushed via the manifest endpoint or via the blob endpoint? Or should it be pushed twice, via both endpoints?
graph TD
A[Manifest A]
B[Manifest B]
B -- layers --> A
B -- subject --> A
Interestingly, the `docker buildx build` command generates build caches that put layers in the `manifests` field of an OCI image index.
When copying such a structure to a remote repository, should `oras.Copy()` push these layers (specified as manifests) via the manifest endpoint or the blob endpoint? 🤔
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.index.v1+json",
"manifests": [
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1",
"size": 32,
"annotations": {
"buildkit/createdat": "2023-01-13T07:49:09.921545067Z",
"containerd.io/uncompressed": "sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef"
}
},
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:d74d7d17ce90514c5eed8068791ab9b1d58f355a367c6a87bd3e0e1dc8113500",
"size": 105,
"annotations": {
"buildkit/createdat": "2023-01-13T07:49:09.864832789Z",
"containerd.io/uncompressed": "sha256:601bb128dc20e9b8a296510b1c840d58dfd7d596ae1396d52e886753423c052c"
}
},
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:df9b9388f04ad6279a7410b85cedfdcb2208c0a003da7ab5613af71079148139",
"size": 2814559,
"annotations": {
"buildkit/createdat": "2023-01-13T07:48:28.219213701Z",
"containerd.io/uncompressed": "sha256:4fc242d58285699eca05db3cc7c7122a2b8e014d9481f323bd9277baacfa0628"
}
},
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:eb630b592770ba0b3982595e566c1027966cf6b9733c5fc1bf0794bf6bc2c9cd",
"size": 3578366,
"annotations": {
"buildkit/createdat": "2023-01-13T07:49:09.693226029Z",
"containerd.io/uncompressed": "sha256:3cb741a610a6253327467f4bb4e3de9397c36846b2407dc56992c04475ced968"
}
},
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"digest": "sha256:ed0f0d4a18721d4dc5d5d8ffb7eaeb0df00ab5d1001bfa594419a1b8dd5ffc09",
"size": 2581904,
"annotations": {
"buildkit/createdat": "2023-01-13T07:48:34.097896893Z",
"containerd.io/uncompressed": "sha256:6d69e1b372ea8a2e13783b213f4cf108be422a05740625a209a054b48c9a76cd"
}
},
{
"mediaType": "application/vnd.buildkit.cacheconfig.v0",
"digest": "sha256:f06f3ad8ce85bcc973c15a11f419e7601e74db8db0e7af8d05587d24d77ffc83",
"size": 2407
}
]
}
We may need to introduce a new method that returns leaf successors and non-leaf successors separately, as a complement to `content.Successors()`.
https://github.com/oras-project/oras-go/blob/76382aaa94873ad14fddacdbff0f5ed32f43c3aa/content/graph.go#L47-L106