BuildKit --cache-from not working

Open HealthyPear opened this issue 3 years ago • 0 comments

Premises

  • I also tried without BuildKit, with the same result (that is, passing only the --cache-from option; the docker build docs seem to say BuildKit is mandatory for it, so I tried that too)
  • this is probably related to e.g. #2274 ...
  • ...or probably not, because I am not an expert and I might be doing things wrongly and/or inefficiently :)
  • as far as I can see, I am using BuildKit correctly within the GitLab CI
  • my suspicion is that this is a file permission problem or something like it
  • I don't think my cache is invalidated between builds, because it's a sequence of jobs associated with the same commit (even though I am copying the whole git repo into the image, which I know is a no-no, but I want the user to have the repo when they enter the container)
  • I am not using COPY statements between stages, but in principle they shouldn't be needed: everything should remain there, and I did check, by pulling image no. 2, that the installed things are still there
  • it doesn't even work locally: I pull the 2nd stage image from the GitLab Container Registry and run e.g. DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 --target aerie --cache-from registry.gitlab.com/swgo-collaboration/aerie-install/externals:test-docker -t test_build_aerie:test-docker . and I see the same behaviour that I see in the job's log
  • apparently a single CI job made of pull/build/push instructions works as I expect; my question is why it doesn't work with 1 job per stage, given that each job pulls the previously built image from the container registry... shouldn't it behave in the same way?
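
For readability, the local reproduction described in the list above can be written out as a small script. This is only a sketch: the docker commands and image names are the ones quoted above, while the DRY_RUN switch and the run/reproduce helpers are hypothetical additions so that the commands can be printed without a Docker daemon.

```shell
#!/bin/sh
# Sketch of the local reproduction described above.
# DRY_RUN, run() and reproduce() are hypothetical helpers; only the docker
# commands and the image names come from the report.
export DOCKER_BUILDKIT=1

CACHE_IMAGE="registry.gitlab.com/swgo-collaboration/aerie-install/externals:test-docker"
LOCAL_TAG="test_build_aerie:test-docker"

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reproduce() {
    # Pull the image pushed by the second CI job so it can act as cache source.
    run docker pull "$CACHE_IMAGE"
    # Build the third stage, pointing --cache-from at the pulled image.
    run docker build \
        --build-arg BUILDKIT_INLINE_CACHE=1 \
        --target aerie \
        --cache-from "$CACHE_IMAGE" \
        -t "$LOCAL_TAG" .
}

reproduce
```

By default the script only prints the two commands; running it with DRY_RUN=0 set in the environment executes them from the repository root.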

THE PROBLEM (aka: what I was expecting)

  • I can see that if I re-run the first job "base" it works with all the associated cache
  • also the second job uses the correct cache (the "base" image)
  • the third job should use the cache from the second stage "externals", but instead it re-runs everything after COPY --chown=swgo . /home/swgo/aerie-install, which is the last instruction before the 2nd build stage is defined.
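
One way to check which steps each job actually serves from cache is to capture the build output with --progress=plain and filter it. The following is a minimal sketch, assuming the BuildKit plain log format in which step headers look like "#7 [externals 2/2] RUN ..." and cache hits are reported as "#7 CACHED"; the function names list_cached and list_rebuilt are hypothetical.

```shell
#!/bin/sh
# Helpers for reading a BuildKit log captured with --progress=plain.
# Assumption: step headers look like "#7 [externals 2/2] RUN ..." and
# cache hits are reported on a line of the form "#7 CACHED".

# Print the header of every step that BuildKit reported as CACHED.
list_cached() {
    awk '
        /^#[0-9]+ \[/      { header[$1] = $0 }
        /^#[0-9]+ CACHED$/ { print header[$1] }
    ' "$1"
}

# Print the header of every step that was not reported as CACHED,
# in the order in which the steps first appear in the log.
list_rebuilt() {
    awk '
        /^#[0-9]+ \[/      { if (!($1 in header)) order[++n] = $1; header[$1] = $0 }
        /^#[0-9]+ CACHED$/ { cached[$1] = 1 }
        END {
            for (i = 1; i <= n; i++)
                if (!(order[i] in cached)) print header[order[i]]
        }
    ' "$1"
}
```

Both functions take the path of a saved log, for instance one produced by adding --progress=plain to the build command and piping its output through 2>&1 | tee build.log. Internal steps such as loading the build definition also show up in list_rebuilt, so only the stage steps are of interest.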

I provide here my Dockerfile and GitLab CI config, hoping that the problem is easy for experts to spot!

Dockerfile

# Set the base image
# First build stage: install OS dependencies and setup user environment
FROM ubuntu:18.04 AS base

# Define labels
LABEL version="0.1" \
    maintainer="Michele Peresano" \
    description="Docker image to create containers for using SWGO-AERIE"

# Install basic packages
# (DEBIAN_FRONTEND is exported so that it applies to both apt-get calls)
RUN export DEBIAN_FRONTEND=noninteractive && apt-get update -y -qq < /dev/null > /dev/null &&\
    apt-get install -y -qq git vim wget curl \
    dpkg-dev cmake g++ gcc binutils libx11-dev libxpm-dev \
    libxft-dev libxext-dev python libssl-dev \
    gfortran libpcre3-dev \
    xlibmesa-glu-dev libglew1.5-dev libftgl-dev \
    libmysqlclient-dev libfftw3-dev libcfitsio-dev \
    graphviz-dev libavahi-compat-libdnssd-dev \
    libldap2-dev python-dev libxml2-dev libkrb5-dev \
    libgsl0-dev qtwebengine5-dev libxmu-dev < /dev/null > /dev/null

# Define non-root user and set its home directory
ARG USERNAME=swgo
ENV HOME=/home/swgo

# Set some other environment variables for the Docker build process
ENV AERIE_INSTALL=$HOME/aerie-install
ENV INSTALL_DIR=$HOME/SWGO-AERIE
ENV CONDA_PATH=$HOME/mambaforge

# Add user 'swgo' with sudo powers
# in case user wants to install something else afterwards
# Create workdir before Docker does it and give ownership to new user
RUN useradd -rm -d $HOME -s /bin/bash -u 1000 ${USERNAME} \
    && usermod -aG sudo ${USERNAME} && mkdir $AERIE_INSTALL && chown ${USERNAME} $AERIE_INSTALL
# Switch to the new user for the rest of the build
USER ${USERNAME}

# Override default shell and use bash
SHELL ["/bin/bash", "--login", "-c"]

# Set working directory to aerie-install repository
WORKDIR $AERIE_INSTALL

# Copy repository to the filesystem of the container
# (the .dockerignore file keeps the Dockerfile itself from being copied)
COPY --chown=${USERNAME} . $AERIE_INSTALL

# Second build stage: install conda and externals with APE
FROM base as externals

# Create password file for non-interactive download of external dependencies
RUN ls -l $HOME && ls -l\
    && touch ape_password.txt \
    && echo "(NoPasswordNeededPressReturn )" >> ape_password.txt \
    && chmod 0600 ape_password.txt

# Launch installation script
# - install AERIE and externals to $INSTALL_DIR
# - use 4 threads
# - install mambaforge/conda to $CONDA_PATH and initialize swgo_env
# - install externals with APE using defaults, password file and silent wget
# - get latest aerie commit from git submodule
# - build and install AERIE (without tests, which are partially failing for now...)
# - update .bashrc to have the environment initialized at startup
RUN bash main.sh \
    -d $INSTALL_DIR \
    -n 4 \
    -c $CONDA_PATH \
    -e \
    -a ,,,$PWD/ape_password.txt,yes

# Third build stage: build and install AERIE
FROM externals as aerie

# - Remove old initialization file, it will be re-written
# - conda installed, required environment created, so -e will just check/activate it
# - APE should just find installed packages and not re-download source files
# - AERIE is built and installed
# - .bashrc is updated with the 'initialize_swgo_aerie' function
RUN rm $INSTALL_DIR/initialize_swgo_aerie.sh \
    && bash main.sh \
    -d $INSTALL_DIR \
    -n 4 \
    -e $CONDA_PATH \
    -a ,,,$PWD/ape_password.txt,yes \
    -u -b -i \
    -s ,function

# Fourth and final build stage: final setup of the container environment
FROM aerie

# Update login shell to activate conda base environment and SWGO-AERIE environment
RUN echo "alias ls='ls --color'" >> $HOME/.bashrc \
    && echo "mambaforge" >> $HOME/.bashrc \
    && echo "initialize_swgo_aerie" >> $HOME/.bashrc

# Set welcome directory to HOME and customize user's bashrc
WORKDIR $HOME

# This command will be executed when entering the container
# - bash will be the shell in use
# - the SWGO-AERIE environment will be initialized
ENTRYPOINT ["bash"]

.gitlab-ci.yaml

image: docker:20.10.17
services:
  - docker:20.10.17-dind

default:
  tags: # use the available specific runner by default
    - aerie-install
    - test

stages:
  - build-base
  - build-externals
  - build-aerie
  - build-final
  - test
  - release

variables:
  # Use TLS https://docs.gitlab.com/ee/ci/docker/using_docker_build.html#tls-enabled
  DOCKER_HOST: tcp://docker:2376
  # Specify to Docker where to create the certificates. Docker
  # creates them automatically on boot, and creates
  # `/certs/client` to share between the service and job
  # container, thanks to volume mount from config.toml
  DOCKER_TLS_CERTDIR: "/certs"
  DOCKER_BUILDKIT: "1"
  BASE_IMAGE: $CI_REGISTRY_IMAGE/base:$CI_COMMIT_REF_SLUG
  EXTERNALS_IMAGE: $CI_REGISTRY_IMAGE/externals:$CI_COMMIT_REF_SLUG
  TEST_AERIE_IMAGE: $CI_REGISTRY_IMAGE/test_build_aerie:$CI_COMMIT_REF_SLUG
  CONTAINER_TEST_IMAGE: $CI_REGISTRY_IMAGE/test_final:$CI_COMMIT_REF_SLUG
  CONTAINER_RELEASE_IMAGE: $CI_REGISTRY_IMAGE/release:$CI_COMMIT_REF_SLUG

before_script:
  - docker info
  - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY

build-base:
  stage: build-base
  script:
    # Pull the older version of the "base" stage if it's available
    # The "|| true" part makes the shell ignore the error if the pulled image does not exist yet.
    - docker pull $BASE_IMAGE || true
    - docker build --build-arg BUILDKIT_INLINE_CACHE=1 --target base --cache-from $BASE_IMAGE -t $BASE_IMAGE .
    - docker push $BASE_IMAGE

build-externals:
  stage: build-externals
  script:
    - docker pull $BASE_IMAGE || true
    - docker pull $EXTERNALS_IMAGE || true
    - docker build
      --build-arg BUILDKIT_INLINE_CACHE=1
      --target externals
      --cache-from $BASE_IMAGE
      --cache-from $EXTERNALS_IMAGE
      -t $EXTERNALS_IMAGE .
    - docker push $EXTERNALS_IMAGE

build-aerie:
  stage: build-aerie
  script:
    - docker pull $EXTERNALS_IMAGE || true
    - docker pull $TEST_AERIE_IMAGE || true
    - docker build
      --build-arg BUILDKIT_INLINE_CACHE=1
      --target aerie
      --cache-from $EXTERNALS_IMAGE
      --cache-from $TEST_AERIE_IMAGE
      -t $TEST_AERIE_IMAGE .
    - docker push $TEST_AERIE_IMAGE

build-final:
  stage: build-final
  script:
    - docker pull $TEST_AERIE_IMAGE || true
    - docker pull $CONTAINER_TEST_IMAGE || true
    - docker build
      --build-arg BUILDKIT_INLINE_CACHE=1
      --cache-from $TEST_AERIE_IMAGE
      --cache-from $CONTAINER_TEST_IMAGE
      -t $CONTAINER_TEST_IMAGE .
    - docker push $CONTAINER_TEST_IMAGE

test:
  stage: test
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker run $CONTAINER_TEST_IMAGE -c "bash main.sh -k ~/SWGO-AERIE"

release-image:
  stage: release
  script:
    - docker pull $CONTAINER_TEST_IMAGE || true
    - docker tag $CONTAINER_TEST_IMAGE $CONTAINER_RELEASE_IMAGE
    - docker push $CONTAINER_RELEASE_IMAGE

HealthyPear, Jul 28 '22 15:07