
performance: what can dist-spec do to improve downloads of large images/layers?

Open rchincha opened this issue 1 year ago • 7 comments

Things we already allow/do:

  1. Parallel download of layers
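
A minimal sketch of what parallel layer download can look like on the client side. The `fetch_blob` callable is a hypothetical stand-in for an HTTP GET of one blob by digest and is not part of any real client library. Only sha256 digests are handled here.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def pull_layers(fetch_blob, digests, max_workers=4):
    """Fetch every layer concurrently and verify each one against its digest.

    fetch_blob is a hypothetical callable (digest -> bytes); a real client
    would issue one blob GET per digest here.
    """
    def fetch_and_verify(digest):
        data = fetch_blob(digest)
        actual = "sha256:" + hashlib.sha256(data).hexdigest()
        if actual != digest:
            raise ValueError(f"digest mismatch for {digest}: got {actual}")
        return data

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as `digests`.
        return dict(zip(digests, pool.map(fetch_and_verify, digests)))
```

This parallelizes across layers only, so one very large layer still limits the total pull time.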

Things we can likely improve:

  1. For large layers, range-based downloads: fetch sections of a large blob using the HTTP Range header and stitch them back together?
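
The range-based idea above could be sketched roughly as follows. `fetch_range` is a hypothetical callable standing in for a GET with a `Range` header; the chunk size and worker count are arbitrary illustrative values, not recommendations.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, chunk_size):
    """Return inclusive (start, end) byte offsets that cover [0, size)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(start, min(start + chunk_size, size) - 1)
            for start in range(0, size, chunk_size)]


def range_header(start, end):
    """Format an HTTP Range header value; both offsets are inclusive."""
    return f"bytes={start}-{end}"


def fetch_in_ranges(fetch_range, size, chunk_size, max_workers=4):
    """Fetch a blob as parallel ranges and stitch the parts back in order.

    fetch_range is a hypothetical callable: (start, end) -> bytes.
    """
    ranges = split_ranges(size, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        parts = list(pool.map(lambda r: fetch_range(*r), ranges))
    for (start, end), part in zip(ranges, parts):
        if len(part) != end - start + 1:
            raise IOError(f"short read for {range_header(start, end)}")
    return b"".join(parts)
```

This version stitches everything in memory for brevity. A real client would more likely write each part at its offset in a file and verify the blob digest once all parts have arrived.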

Things known to the community:

  1. https://github.com/containerd/stargz-snapshotter/blob/main/docs/estargz.md
     ^ FUSE-based filesystem solutions: fetch individual files when they are referenced

rchincha avatar May 28 '24 17:05 rchincha

For streaming/lazy loading:

Multipart layer downloads with range requests:

  • https://github.com/containerd/containerd/pull/10177
  • https://github.com/awslabs/amazon-ecr-containerd-resolver#parallel-downloads

Direct mounts of compressed tars (saves on extraction time):

  • https://github.com/containerd/containerd/pull/9362#issuecomment-1809829145
  • https://github.com/containers/composefs

samuelkarp avatar May 28 '24 18:05 samuelkarp

What changes are needed in distribution-spec to support this? Is a pointer to the HTTP specs documenting range requests enough?
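
For context on what the HTTP specs already cover: under RFC 9110, a server that honors a single-range request answers `206 Partial Content` with a `Content-Range` header, while a plain `200` means the `Range` header was ignored and the full body was sent. A rough sketch of a client-side check, assuming headers arrive as a plain dict keyed by `Content-Range` (the function names are illustrative only):

```python
import re

_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value):
    """Parse 'bytes start-end/total' into (start, end, total).

    total is None when the server reports an unknown length ('*').
    """
    match = _CONTENT_RANGE.match(value.strip())
    if not match:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if end < start:
        raise ValueError(f"invalid Content-Range: {value!r}")
    total = None if match.group(3) == "*" else int(match.group(3))
    return start, end, total


def range_honored(status, headers, start, end):
    """Return True only if the response carries exactly the requested range."""
    if status != 206 or "Content-Range" not in headers:
        return False  # e.g. a 200 with the whole blob
    got_start, got_end, _ = parse_content_range(headers["Content-Range"])
    return (got_start, got_end) == (start, end)
```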

sudo-bmitch avatar May 28 '24 18:05 sudo-bmitch

https://github.com/opencontainers/image-spec/issues/1190

rchincha avatar May 28 '24 18:05 rchincha

> What changes are needed in distribution-spec to support this? Is a pointer to the HTTP specs documenting range requests enough?

IMO the question is: what server- or client-side optimizations can dist-spec changes demonstrably enable?

Just like conformance, we should write benchmark code for this.
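
As a very rough idea of what such benchmark code might look like, here is a tiny timing harness. It is a sketch under assumptions, not an existing OCI tool: each strategy is a hypothetical zero-argument callable that performs one complete pull, for example a single GET versus a ranged download.

```python
import time


def benchmark(strategies, repeats=3):
    """Return the best wall-clock time in seconds for each strategy.

    strategies maps a label to a zero-argument callable that performs one pull.
    """
    results = {}
    for name, pull in strategies.items():
        timings = []
        for _ in range(repeats):
            started = time.perf_counter()
            pull()
            timings.append(time.perf_counter() - started)
        # The minimum is less sensitive to one-off network stalls than the mean.
        results[name] = min(timings)
    return results
```

Meaningful numbers would also need cold and warm cache runs, several blob sizes, and more than one registry.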

rchincha avatar May 28 '24 20:05 rchincha

> Just like conformance, we should write benchmark code for this.

Is that an OCI requirement, or something implementations should be doing?

sudo-bmitch avatar Jun 08 '24 15:06 sudo-bmitch

>> Just like conformance, we should write benchmark code for this.

> Is that an OCI requirement, or something implementations should be doing?

Not an OCI requirement - our conformance suite should already ensure conformant registries can handle range-based blob pulls.

https://github.com/opencontainers/distribution-spec/pull/537#issuecomment-2155821558
^ It will be interesting to see data from various registries. If viable, clients should adopt this model, choosing it based on blob size.
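
A sketch of how a client might pick a strategy from blob size, which it already knows from the descriptor's `size` field before it requests the blob. The 64 MiB threshold and 16 MiB chunk size are placeholder assumptions and are not derived from any measurement.

```python
MIB = 1024 * 1024


def choose_strategy(blob_size, threshold=64 * MIB, chunk_size=16 * MIB):
    """Return (strategy, number_of_requests) for a blob of the given size.

    Both defaults are hypothetical; real values should come from benchmarks.
    """
    if blob_size < threshold:
        return ("single", 1)
    parts = -(-blob_size // chunk_size)  # ceiling division
    return ("ranged", parts)
```

For example, `choose_strategy(10 * MIB)` keeps a small layer on a single GET, while `choose_strategy(100 * MIB)` would split it into seven ranged requests under these placeholder values.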

rchincha avatar Jun 09 '24 17:06 rchincha

#546

rchincha avatar Jul 16 '24 17:07 rchincha