Introduce a way to group commands in a single layer in a Dockerfile

Open eteran opened this issue 5 years ago • 5 comments

Tell us about your request

I think it would be very useful to be able to group a bunch of Dockerfile commands into a single layer transaction. For example:

# we use a multi-stage build to compile all files for the site...

COPY --from=builder /dist /var/www
RUN find /var/www/ -type d -print0 | xargs -n100 -P4 -0 chmod 0755 # for directories
RUN find /var/www/ -type f -print0 | xargs -n100 -P4 -0 chmod 0644 # for files

This can cause the image to take up unexpectedly more space, because each of those 3 commands potentially introduces a new layer. We can do things like use a multi-stage build to set the permissions and then copy from there, but that seems like severe overkill for this purpose. I think a simple solution would look like this:

LAYER_START
# an illustrative example, there are circumstances where it would be nice to 
# be able to do this outside of multi-stage builds too!

COPY --from=builder /dist /var/www
RUN find /var/www/ -type d -print0 | xargs -n100 -P4 -0 chmod 0755 # for directories
RUN find /var/www/ -type f -print0 | xargs -n100 -P4 -0 chmod 0644 # for files
LAYER_END

which would basically cause anything between a LAYER_START/LAYER_END pair to be squashed. This would allow a chain of commands that is intended to be an atomic change to create only a single layer, with no unnecessary wasted space.

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Basically, some sets of Docker commands make sense as an atomic transaction. There should be a simple mechanism for that instead of having to introduce a multi-stage build every time it's needed.

Are you currently working around the issue?

Yes, I use a multi-stage build that generates the files and sets the permissions, and then I copy from there.
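As a rough illustration of the permission step that such a builder stage would run, the two find/xargs commands from the request can be tried in plain shell outside Docker. This is only a sketch under assumptions: the temporary `site` directory and the files in it are hypothetical stand-ins for the real /dist tree.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for the /dist tree produced by the builder stage.
site=$(mktemp -d)
mkdir -p "$site/css"
echo 'body {}' > "$site/css/main.css"
echo '<html></html>' > "$site/index.html"

# Start from deliberately restrictive modes so the effect is visible.
chmod 0700 "$site/css"
chmod 0600 "$site/css/main.css" "$site/index.html"

# The same two commands as in the request, pointed at the stand-in tree.
find "$site" -type d -print0 | xargs -n100 -P4 -0 chmod 0755  # for directories
find "$site" -type f -print0 | xargs -n100 -P4 -0 chmod 0644  # for files

# Count entries that still have a different mode than intended.
bad_dirs=$(find "$site" -type d ! -perm 0755 | wc -l)
bad_files=$(find "$site" -type f ! -perm 0644 | wc -l)
echo "directories with unexpected mode: $bad_dirs"
echo "files with unexpected mode: $bad_files"
```

In the multi-stage workaround these commands run in the builder stage, so the final stage only needs the single COPY --from=builder.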

eteran avatar Nov 03 '20 20:11 eteran

This sounds a bit like an expansion on the HEREDOC proposal: https://github.com/moby/moby/issues/34423

ingshtrom avatar Nov 10 '20 14:11 ingshtrom

@ingshtrom to an extent, but unlike the HEREDOC proposal, my proposal would allow mixing multiple types of commands as well.

eteran avatar Nov 10 '20 20:11 eteran

@eteran it would be interesting to have the commands share an environment as well, so e.g. Python virtual environments would work within the layer bookends.

robertlagrant avatar Apr 12 '21 18:04 robertlagrant

Want to bump this thread. I think having finer-grained control over when Docker creates a layer would be very helpful.

red-avalanche avatar May 06 '24 18:05 red-avalanche

I couldn't find a dedicated proposal for this anywhere, but this issue is a more general version of it.

Why is there still no support for COPYing multiple directories to a <dest>ination? I think everyone has stumbled over this at least once. You want to do either cp -r dir dest/ or cp -r dir1 dir2 dest/, but in the end only the "contents are copied", leaving your perfect project structure behind.

And here is where the overlap comes in: to accomplish what cp can do, you have to either write:

COPY dir dest/dir

or:

COPY dir1 dest/dir1
COPY dir2 dest/dir2

In the first case, there is only a simple but still annoying duplication of the directory's name, but in the second one, there is a "duplication" of the entire command. So, if you need to copy 2 or 10 directories, then you have to write 2 or 10 (!) COPY commands just to preserve the project structure. This means that multiple unnecessary layers are being made only because the COPY's syntax isn't flexible enough.

The proposal in this issue could fix this by merging the adjacent COPY commands into a single new layer (instead of 10). For image efficiency this is good enough, but it wouldn't fix the Dockerfile bloat from the "hundreds" of adjacent COPY lines.

The reason COPY . . plus .dockerignore (the only other reasonable solution) wouldn't fly is this: imagine you have some separate stuff that you want to copy at the very end (some static content that doesn't need compiling or anything). If this content changes, the COPY . . command, which runs before the compilation steps, is re-run, and the build cache for every step after it is invalidated. So for this simple reason (which may or may not be a rare case) the problem is still present.

The other 2 "solutions" are more like "hacks", because they require additional disk space and write operations prior to the docker build. I'm talking about either doing mkdir -p dirs && cp -r dir1 dir2 dirs and then COPY dirs dest/, or tar cf dirs.tar dir1 dir2 and then ADD dirs.tar dest/. In addition to the extra space, preprocessing etc., you would also have to remove what you've created during preprocessing (rm -rf dirs or rm -f dirs.tar).
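To make the two "hacks" concrete, here is a minimal sketch of the preprocessing and cleanup around the build. The directory names, the sample files, and the throwaway working directory are all hypothetical, and the docker build step itself is left out.

```shell
#!/bin/sh
set -eu

# Hypothetical project layout, created in a throwaway directory for the demo.
work=$(mktemp -d)
cd "$work"
mkdir -p dir1/sub dir2
echo one > dir1/sub/a.txt
echo two > dir2/b.txt

# Variant 1: stage the directories, then use `COPY dirs dest/` in the Dockerfile.
mkdir -p dirs
cp -r dir1 dir2 dirs/

# Variant 2: pack them into a tarball, then use `ADD dirs.tar dest/`
# (ADD unpacks a local tar archive into the destination).
tar cf dirs.tar dir1 dir2

# Both variants keep the top-level directory names intact.
staged=$(find dirs -type f | sort)
packed=$(tar tf dirs.tar | grep -v '/$' | sort)
echo "$staged"
echo "$packed"

# ... docker build would run here ...

# Cleanup that the workaround forces on you afterwards.
rm -rf dirs
rm -f dirs.tar
```

Either way, the build is no longer a single command: it needs a wrapper script or Makefile target that stages, builds, and cleans up.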

So all 3 currently available workarounds have some downsides. This kinda defeats the purpose of just writing the Dockerfile and then just building the image. It becomes a hassle, and ain't no one like that.

Do I create a separate issue about updating the COPY(/ADD) command? Maybe there is already a discussion about this somewhere?

Andrew15-5 avatar Jun 22 '24 16:06 Andrew15-5