
[feature request][performance] sort layers by size before remote-to-remote copy

Open mirekphd opened this issue 3 years ago • 8 comments

When copying between two remote registries, skopeo copy (like docker pull) has no problem with parallel and "out-of-sequence" pulls and pushes, because image layers never need to be extracted during such a remote-to-remote copy (the sequence would matter only if the tarballs had to be extracted).

Layers can differ significantly in size (from 1 kB for an ARG/ENV layer to 1 GB and more for, e.g., CUDA library layers). So why not beat docker at transfer speed and sort layers by descending size before pulls or pushes, rather than by their Dockerfile-induced sequence?

mirekphd avatar Aug 31 '22 11:08 mirekphd

Thanks for reaching out.

Can you elaborate on why the order matters?

As for pulls: the order must be preserved as the layers must be applied to the local storage in the exact order.

vrothberg avatar Aug 31 '22 11:08 vrothberg

If I understand it correctly, the idea is that a layer that is 50% of the total image size should be started first, so that the others can all be pulled in parallel with the big one, and the total time is about the same as the time to pull the big layer, instead of spending time copying 10 smaller layers, and only after most of that is done, starting the big layer.

That only makes a difference for moderately unbalanced images, where the largest layer is probably > 1/6 of the total image size, but not something like 99%.
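To make the argument above concrete, here is a minimal sketch of a simulation. It assumes a fixed pool of parallel copy workers (6 here, chosen only to match the 1/6 figure above), that each layer takes time proportional to its size, and that per-connection throughput is the bottleneck rather than shared bandwidth. The function name and the layer sizes are hypothetical.

```python
import heapq


def total_copy_time(sizes, workers=6):
    """Return the time at which the last layer finishes copying.

    Layers are started in the given order; each one begins as soon as a
    worker is free and takes time equal to its size (arbitrary units).
    """
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    finish = 0.0
    for size in sizes:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + size
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


def compare(sizes):
    """Return (time in original order, time with largest layers first)."""
    return total_copy_time(sizes), total_copy_time(sorted(sizes, reverse=True))


# Hypothetical images: ten small layers followed by one large layer.
moderately_unbalanced = [10] * 10 + [100]   # big layer is 50% of the image
extremely_unbalanced = [1] * 10 + [990]     # big layer is 99% of the image

print(compare(moderately_unbalanced))
print(compare(extremely_unbalanced))
```

Under these assumptions the moderately unbalanced image finishes in 100 units instead of 110 when the big layer is started first, while the 99% image barely changes (990 instead of 991). If the link is saturated regardless of ordering, the model does not apply and the gain would be smaller.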


I think it’s an interesting optimization worth exploring. We can’t/shouldn’t do that for c/storage, and due to compatibility we’d need an opt-in anyway, but that’s not too bad.

We might want to think about the UI impact — e.g. should we list the progress bars in the original image order, to show the user what’s going on? Currently we create the progress bars only when we start pulling, in order. That might end up being the most complex part of the feature.

mtrmac avatar Aug 31 '22 14:08 mtrmac

Adequate imbalance is guaranteed in many containerized Python applications, for example, which have to be based on Ubuntu, so the base image layer is much larger than the application layers (all the way up to the NVIDIA CUDA images with their astoundingly heavy 3.5... GB base images). The problem is that such unbalanced images are likely pre-sorted already, because the base layer comes first, so size-sorting might not make much of a difference in practice.

On the other hand, the forking has to be done anyway, and altering its sequence does not add any extra overhead. So unless there is some noticeable overhead in gathering layer sizes and sorting them, or in accessing server-side layers "out of order", this new method should always outperform the current one, however small or unnoticeable the gain (and the performance gains should be double, because they should also be achievable during the push phase). I suspect the main reason this has not been done already is the way the legacy system that skopeo inherited from operates. docker pull, however, has a very different use case: to run the container after the pull is complete, rather than to immediately push it somewhere else.
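On the overhead of gathering sizes: image manifests in the OCI and Docker schema 2 formats already carry a `size` field for every layer, so the ordering can be computed from the manifest alone. A minimal sketch, assuming such a manifest; the manifest content, digests, and the `copy_order` helper are hypothetical, and this is not how c/image is structured internally.

```python
import json

# Hypothetical manifest, trimmed to the fields used here.
# The digests are placeholders, not real blobs.
MANIFEST = """
{
  "schemaVersion": 2,
  "layers": [
    {"digest": "sha256:aaa", "size": 30000000},
    {"digest": "sha256:bbb", "size": 1024},
    {"digest": "sha256:ccc", "size": 3500000000},
    {"digest": "sha256:ddd", "size": 1024}
  ]
}
"""


def copy_order(manifest_json):
    """Return layer indices sorted by descending size.

    Only the transfer order changes; the manifest itself keeps its
    original layer order. sorted() is stable, so layers of equal size
    stay in their original relative order.
    """
    layers = json.loads(manifest_json)["layers"]
    return sorted(range(len(layers)), key=lambda i: -layers[i]["size"])


print(copy_order(MANIFEST))
```

Returning indices rather than reordered layer entries keeps the original position of each layer available, which would also help if progress bars are to be shown in image order.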

mirekphd avatar Aug 31 '22 17:08 mirekphd

The way c/storage is set up, pulls must create layers from base to the last child, in order (they have parent links).

Now, whether that’s a 100% hard requirement, where we just can’t create the child before the parent, or more of an implementation choice, depends on the graph driver (it’s 100% hard for device-mapper-snapshots, and it might be a choice for overlay, but I’m not quite sure). Even if it were 100% an implementation choice, that would be a pretty large implementation effort (we would need to have a concept of an extracted diff that is not yet a layer, a mechanism to turn that into a layer quickly, and a cleanup mechanism to delete that extracted diff on unexpected aborts).


For direct registry-to-registry copies, this should be quite easy to do; the progress UI is the hardest part, the rest is just mechanical work. (But note that such copies are not pulls+pushes with a disk intermediary; they are direct streaming copies, so there are no “double” gains.)

For pushes, I think it’s same as registry-to-registry copies, but there’s a small chance I’m missing something.

mtrmac avatar Aug 31 '22 17:08 mtrmac

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Oct 01 '22 00:10 github-actions[bot]

You would also take up more temporary space, as the blobs would exist on disk for a longer period of time. Currently, once a blob is completely downloaded, that layer is applied to storage and the temporary copy is removed.

But if this is a minor change, I think we should do it.
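The temporary-space concern can be illustrated with a small model. This is only a sketch under simplifying assumptions: it applies to pulls into local storage (not registry-to-registry copies), layers must be applied strictly base-first, applying is instantaneous, and a downloaded blob stays on disk until every earlier layer has been applied. The function name and sizes are hypothetical.

```python
def peak_staged_bytes(sizes, completion_order):
    """Return the peak number of bytes held in temporary storage.

    sizes[i] is the size of layer i, base layer first.
    completion_order lists layer indices in the order their downloads finish.
    """
    staged = {}          # layer index -> size of blob waiting to be applied
    next_to_apply = 0    # layers must be applied in index order
    peak = 0
    for idx in completion_order:
        staged[idx] = sizes[idx]
        peak = max(peak, sum(staged.values()))
        # Apply as many layers as possible, strictly base-first,
        # removing each blob from temporary storage once applied.
        while next_to_apply in staged:
            del staged[next_to_apply]
            next_to_apply += 1
    return peak


sizes = [1, 50, 100]  # hypothetical image whose largest layer is last

print(peak_staged_bytes(sizes, [0, 1, 2]))  # downloads finish in image order
print(peak_staged_bytes(sizes, [2, 1, 0]))  # downloads finish largest-first
```

In this model the in-order case never holds more than one blob (peak 100), while the largest-first case holds all three until the base layer arrives (peak 151).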

rhatdan avatar Oct 01 '22 09:10 rhatdan

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar Nov 01 '22 00:11 github-actions[bot]

Moving to c/image; this would be transparent to Skopeo itself.

mtrmac avatar Dec 05 '22 21:12 mtrmac