charliecloud
remove trailing/unimportant whitespace from build instruction cache entry
If I add whitespace that does not affect program behavior, such as trailing whitespace before a newline, the build cache misses. It would be good if the cache were used whenever a modified instruction would not change the outcome.
To reproduce:
$ printf 'FROM alpine:3.9\nRUN echo hi\n' | ch-image build -t foo -
1* FROM alpine:3.9
2. RUN echo hi
updating existing image ...
available --force: alpine: Alpine, any version
hi
grown in 2 instructions: foo
$ printf 'FROM alpine:3.9\nRUN echo hi \n' | ch-image build -t foo -
1* FROM alpine:3.9
2. RUN echo hi
updating existing image ...
available --force: alpine: Alpine, any version
hi
grown in 2 instructions: foo
Note that the trailing space after hi in the second build causes a cache miss. The desired behavior is a hit, since the two commands are equivalent.
This is actually kind of hairy because with the shell form of RUN (e.g., RUN echo hello), we pass the line unchanged to the shell, including all whitespace. Charliecloud doesn't understand shell syntax, and you can even do wacky things like set the shell to Python, so it can't tell which whitespace is safe to drop.
With the “exec” form of RUN that bypasses the shell, it does work as expected:
$ printf 'FROM alpine:3.9\nRUN ["echo", "hi2"]\n' | bin/ch-image build -t foo -
1* FROM alpine:3.9
2. RUN ["echo", "hi2"]
copying image ...
available --force: alpine: Alpine, any version
hi2
grown in 2 instructions: foo
$ printf 'FROM alpine:3.9\nRUN ["echo", "hi2" ]\n' | bin/ch-image build -t foo -
1* FROM alpine:3.9
warning: no metadata to load; using defaults
2* RUN ["echo", "hi2"]
copying image ...
grown in 2 instructions: foo
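To illustrate the difference between the two forms, here is a minimal sketch in Python. It is not Charliecloud's actual code: the function name cache_key and the key format are hypothetical, and it assumes the exec form is recognized by parsing the argument text as a JSON array of strings. Because the exec form is parsed, whitespace between tokens disappears before the key is computed; the shell form is kept verbatim, so any whitespace change yields a different key.

```python
import json


def cache_key(run_args: str) -> str:
    """Return an illustrative cache key for the text following RUN.

    Hypothetical helper, not part of ch-image. Assumes exec form is any
    argument text that parses as a JSON array of strings.
    """
    try:
        parsed = json.loads(run_args)
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, list) and all(isinstance(a, str) for a in parsed):
        # Exec form: re-serialize the parsed list, so whitespace between
        # tokens in the original text no longer matters.
        return "RUN " + json.dumps(parsed)
    # Shell form: the text goes to the shell unchanged, so the key keeps
    # every character, including trailing whitespace.
    return "RUN " + run_args


# Exec form: the extra space before the closing bracket is ignored.
print(cache_key('["echo", "hi2"]') == cache_key('["echo", "hi2" ]'))
# Shell form: the trailing space changes the key.
print(cache_key("echo hi") == cache_key("echo hi "))
```

Stripping trailing whitespace in the shell-form branch would make the echo hi case hit, but whether that is always safe depends on the configured shell, which is the hairy part described above.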