Caching with multi-stage builds and multiple RUNs
Hi, first of all, I would like to thank you for all your work. Kaniko is a very useful tool and I love using it.
I wanted to ask about caching with regard to multi-stage builds that have multiple RUNs in each stage. I am using the kaniko 1.2.0 debug image inside GitLab CI.
I have realised that caching only works with the following Dockerfile:
FROM python:3.8-alpine AS build-deps
RUN apk add --no-cache gcc musl-dev yaml-dev
FROM build-deps AS build
RUN pip install --prefix=/install PyYAML
FROM python:3.8-alpine
COPY --from=build /install /usr/local
In short, each stage has only one RUN directive and none of them uses any special shell tricks.
When I try to use either multiple RUN commands or shell tricks (even just \ to split the installation of packages across multiple lines, or && to chain commands together), no cache is used. It seems that cache layers are produced only after a whole stage, not after each RUN directive. Kaniko tries to find a cached layer for the given RUN value, which is different because there are multiple RUNs or because the shell tricks make it differ from what was retrieved from the image.
The question is: is this correct behaviour?
Sorry for the chaotic description; I tried to describe it as simply as possible. I do not know whether this is intended behaviour or a bug, hence I did not use the bug report template for this issue.
@stanislaw55,
> Kaniko tries to find cached layer for given RUN value which is different because there are multiple ones or shell tricks make it different than was retrieved from image.
Do you expect that, for a RUN directive with multiple commands, e.g.
RUN pip install --prefix=/install PyYAML && echo "text" > somefile
the results of each command, i.e. pip install --prefix=/install PyYAML and echo "text" > somefile, are cached individually?
Kaniko does not implement caching like that; it computes the cache for the whole command `RUN pip install --prefix=/install PyYAML && echo "text" > somefile`.
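As a rough illustration of "one cache entry per whole RUN command", here is a toy model in Python. It is not Kaniko's actual code: the function name `command_cache_key`, the use of SHA-256 over the parent key plus the instruction text, and the `base` stand-in for an image digest are all assumptions made for this sketch.

```python
import hashlib


def command_cache_key(parent_key: str, instruction: str) -> str:
    """Toy cache key: hash of the parent key plus the full instruction text.

    Hypothetical helper for illustration only, not part of Kaniko.
    """
    payload = f"{parent_key}\n{instruction}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Stand-in for the base image digest (assumption for this sketch).
base = "python:3.8-alpine"

combined = 'RUN pip install --prefix=/install PyYAML && echo "text" > somefile'
pip_only = "RUN pip install --prefix=/install PyYAML"

key_combined = command_cache_key(base, combined)
key_pip_only = command_cache_key(base, pip_only)

# The same full instruction text always maps to the same single key.
print(key_combined == command_cache_key(base, combined))

# The part before && has no cache entry of its own inside the combined key.
print(key_combined == key_pip_only)
```

In this model the two parts joined by && never get separate entries; changing either part changes the one key for the whole directive.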
Does that help?
Hi @tejal29
What I meant by 'multi run command' is having more than one RUN directive in a single stage. Sorry if I wasn't clear.
And yes, I realised that Kaniko caching works with shell tricks like && but does not work with multiple RUN directives in one stage. My bad for not testing thoroughly before.
When I ran Kaniko with debug verbosity, I saw that for the Kaniko cache to work, the RUN directive has to be the last one in the stage. If any other directive, such as USER, follows the RUN, the cache stops working, as in the following Dockerfile snippet:
FROM python:3.8-alpine
RUN apk add --no-cache gcc g++ musl-dev libffi-dev openssl-dev yaml-dev && pip install --upgrade pip setuptools wheel cython && adduser -D builder
USER builder
In this case the cache won't be used, but if you delete the USER directive, caching works just fine.
The same happens with multiple RUNs, for example:
FROM python:3.8-alpine
RUN apk add --no-cache gcc g++ musl-dev libffi-dev openssl-dev yaml-dev
RUN pip install --upgrade pip setuptools wheel cython
RUN adduser -D builder
Again, the cache won't be used.
The only example where I have been able to get the cache working is this:
FROM python:3.8-alpine AS user
RUN apk add --no-cache gcc g++ musl-dev libffi-dev openssl-dev yaml-dev \
&& pip install --upgrade pip setuptools wheel cython \
&& adduser -D builder
I do not know whether that is intended, but it differs from how the docker build cache works, and it took me a while to figure out.