remove trailing/unimportant whitespace from build instruction cache entry

Open qwofford opened this issue 3 years ago • 2 comments

If I add whitespace that does not affect program behavior, such as trailing whitespace before a newline, the build cache misses. It would be good if the cache were used whenever a change to a command would not affect the outcome.

qwofford avatar Sep 26 '22 23:09 qwofford

To reproduce:

$ printf 'FROM alpine:3.9\nRUN echo hi\n' | ch-image build -t foo -
  1* FROM alpine:3.9
  2. RUN echo hi
updating existing image ...
available --force: alpine: Alpine, any version
hi
grown in 2 instructions: foo
$ printf 'FROM alpine:3.9\nRUN echo hi \n' | ch-image build -t foo -
  1* FROM alpine:3.9
  2. RUN echo hi
updating existing image ...
available --force: alpine: Alpine, any version
hi
grown in 2 instructions: foo

Note that the trailing space after hi in the second build causes a cache miss: the RUN instruction is marked "." rather than "*", and the command runs again. The desired behavior is a hit, since the command is equivalent.

reidpr avatar Sep 27 '22 16:09 reidpr

This is actually kind of hairy because with the shell form of RUN (e.g., RUN echo hello), we pass the line unchanged to the shell, including all whitespace, because Charliecloud doesn't understand shell syntax (and you can even do wacky things like set the shell to Python).
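
To illustrate why stripping whitespace is not safe in general, here is a minimal sketch using Python's standard shlex module as a rough stand-in for a POSIX shell tokenizer. This is only an approximation for illustration; shell_tokens is a hypothetical helper, not anything in Charliecloud, and none of it applies if the shell has been set to something other than sh.

```python
import shlex

def shell_tokens(line):
    """Hypothetical helper: tokenize roughly as a POSIX shell would.

    shlex only approximates sh, so treat the result as illustrative.
    """
    return shlex.split(line)

# Trailing whitespace outside quotes does not change the tokens ...
same = shell_tokens("echo hi") == shell_tokens("echo hi ")
# ... but whitespace inside quotes is part of the argument.
different = shell_tokens('echo "hi"') != shell_tokens('echo "hi "')
print(same, different)
```

Both comparisons come out true, so deciding which whitespace is "unimportant" requires understanding the syntax of whatever interpreter receives the line.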

With the “exec” form of RUN that bypasses the shell, it does work as expected:

$ printf 'FROM alpine:3.9\nRUN ["echo", "hi2"]\n' | bin/ch-image build -t foo -
  1* FROM alpine:3.9
  2. RUN ["echo", "hi2"]
copying image ...
available --force: alpine: Alpine, any version
hi2
grown in 2 instructions: foo
$ printf 'FROM alpine:3.9\nRUN ["echo", "hi2" ]\n' | bin/ch-image build -t foo -
  1* FROM alpine:3.9
warning: no metadata to load; using defaults
  2* RUN ["echo", "hi2"]
copying image ...
grown in 2 instructions: foo
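
One way to picture why the exec form is insensitive to this kind of whitespace: the argument list is a JSON array, so it can be parsed and re-serialized into a canonical string. The sketch below assumes that approach; exec_form_key is a hypothetical name, and this is not a claim about how ch-image actually computes its cache keys.

```python
import json

def exec_form_key(args_text):
    """Hypothetical helper: canonical string for an exec-form argument list.

    Parses the JSON array and re-serializes it without optional whitespace,
    so formatting differences in the Dockerfile disappear.
    """
    parsed = json.loads(args_text)
    if not (isinstance(parsed, list) and all(isinstance(a, str) for a in parsed)):
        raise ValueError("exec form must be a JSON array of strings")
    return json.dumps(parsed, separators=(",", ":"))

# The two spellings from the transcript above map to the same key.
print(exec_form_key('["echo", "hi2"]') == exec_form_key('["echo", "hi2" ]'))
```

Whitespace inside the strings themselves (e.g., "hi2 ") would still change the key, which is the right behavior because it changes the argument.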

reidpr avatar Sep 27 '22 16:09 reidpr