gitlab-ci-local
error during connect: Get "https://docker:2376/v1.24/containers/json?all=.." tls: failed to verify certificate: x509: certificate signed by unknown authority
TL;DR: two jobs use the same job template, each with a different docker-compose.yaml. One works fine; the other fails with the error above when connecting to the DinD service.
Minimal .gitlab-ci.yml illustrating the issue
---
Component Test out_of_disk:
  stage: component_test
  image: docker:24.0.2-git
  services:
    - name: registry.hub.docker.com/library/docker:24.0.2-dind
      alias: docker
  before_script:
    - docker-compose ${ENV_FILE:+--env-file $ENV_FILE} -f "$TEST_COMPOSE_PATH" $(test
      -f "$NETWORK_COMPOSE_PATH" && echo "-f $NETWORK_COMPOSE_PATH") up -d --force-recreate
      && sleep 40
  script:
    - |
      docker run -v ${CI_PROJECT_DIR}:${CI_PROJECT_DIR} \
      test-server:latest /bin/bash -c \
      "pytest -v ${CI_PROJECT_DIR}/$PATH_TO_TEST_FILS"
Expected behavior: the test job runs successfully.
Host information: Ubuntu 22.04, gitlab-ci-local 4.46.1
Containerd binary: docker
Additional context: I have two jobs that use the same template job. One uses docker compose A, the other docker compose B. The job that uses compose A finishes successfully; the other fails with this error:
Component Test out_of_disk $ docker-compose ${ENV_FILE:+--env-file $ENV_FILE} -f "$TEST_COMPOSE_PATH" $(test -f "$NETWORK_COMPOSE_PATH" && echo "-f $NETWORK_COMPOSE_PATH") up -d --force-recreate && sleep 40
Component Test out_of_disk > error during connect: Get "https://docker:2376/v1.24/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Ddocker%22%3Atrue%7D%7D": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "docker:dind CA")
Compose a:
version: "3.5"
services:
  redis-db:
    image: redislabs/rejson:2.0.11
    container_name: redis-db
    restart: unless-stopped
    ports:
      - 6379:6379
    networks:
      - runner
  garbage-collector:
    image: ${DOCKER_DOWNLOAD_NEXUS_REGISTRY}/app/garbage_collector/garbage_collector:v0.1.98
    container_name: garbage-collector
    restart: unless-stopped
    environment:
      DC_HOME: /dc_home
    volumes:
      - ${DC_HOME}:/dc_home
    ports:
      - 60000:60000
    networks:
      - runner
  runner:
    image: ${DOCKER_DOWNLOAD_NEXUS_REGISTRY}/app/runner/pyrunner-generic-cpu:${VERSION}
    container_name: runner
    restart: unless-stopped
    environment:
      DC_HOME: /dc_home
    volumes:
      - ${CI_PROJECT_DIR}/test/configurations/component_test_configuration.ini:/configuration.ini:ro
      - /netapp:/netapp
      - /tmp:/tmp:rw
      - ${CI_PROJECT_DIR}/docker/algorithmic_solutions_list_mock.txt:/home/scripts/algorithmic_solutions_list.txt
      - ${CI_PROJECT_DIR}/docker/deployment_settings_list.txt:/home/deployments/deployment_settings_list.txt
      - ${DC_HOME}:/dc_home
    ports:
      - 8080:8080
    networks:
      - runner
networks:
  runner:
Compose b:
version: "3.5"
services:
  redis-db:
    image: redislabs/rejson:2.0.11
    container_name: redis-db
    restart: unless-stopped
    ports:
      - 6379:6379
    networks:
      - runner
  garbage-collector:
    image: ${DOCKER_DOWNLOAD_NEXUS_REGISTRY}/app/garbage_collector/garbage_collector:v0.1.98
    container_name: garbage-collector
    restart: unless-stopped
    environment:
      DC_HOME: /dc_home
    volumes:
      - ${DC_HOME}:/dc_home
    ports:
      - 60000:60000
    networks:
      - runner
  runner:
    image: ${DOCKER_DOWNLOAD_NEXUS_REGISTRY}/app/runner/pyrunner-generic-cpu:${VERSION}
    container_name: runner
    restart: unless-stopped
    environment:
      DC_HOME: /dc_home
    volumes:
      - ${CI_PROJECT_DIR}/test/configurations/component_test_configuration.ini:/configuration.ini:ro
      - /netapp:/netapp
      - /tmp:/tmp:rw
      - data-storage-vol:/tmp/data/jobs/component_test/job_123456:rw
      - ${CI_PROJECT_DIR}/docker/algorithmic_solutions_list_mock.txt:/home/scripts/algorithmic_solutions_list.txt
      - ${CI_PROJECT_DIR}/docker/deployment_settings_list.txt:/home/deployments/deployment_settings_list.txt
      - ${DC_HOME}:/dc_home
    ports:
      - 8080:8080
    networks:
      - runner
volumes:
  data-storage-vol:
    driver_opts:
      type: "tmpfs"
      device: "tmpfs"
      o: "size=${RAM_DRIVE_SIZE:?err},uid=1000"
networks:
  runner:
My .gitlab-ci-local-env file:
PRIVILEGED=true
ULIMIT=8000:16000
VOLUME="/etc/docker/daemon.json:/etc/docker/daemon.json certs:/certs/client /netapp:/netapp"
VARIABLE="DOCKER_TLS_CERTDIR=/certs DOCKER_BUILDKIT=0 COMPOSE_DOCKER_CLI_BUILD=0"
Please, get rid of as many factors and yaml as possible.
This is not a minimal example at all 😁
It makes it overly difficult for others to debug.
Is it better now?
---
test-job:
  image: docker:24.0.2-git
  services:
    - name: registry.hub.docker.com/library/docker:24.0.2-dind
      alias: docker
  script:
    - docker version
Using this .gitlab-ci-local-env
PRIVILEGED=true
ULIMIT=8000:16000
VOLUME="/etc/docker/daemon.json:/etc/docker/daemon.json certs:/certs/client /netapp:/netapp"
VARIABLE="DOCKER_TLS_CERTDIR=/certs DOCKER_BUILDKIT=0 COMPOSE_DOCKER_CLI_BUILD=0"
I'm not seeing the symptom when using a simple example.
https://gitlab.com/firecow/gitlab-ci-debugging/-/jobs/6254313223
Strip down your example, line by line, instruction by instruction, and eventually I'm sure we will get to the bottom of this.
It would be nice to add some sort of hint for others that might end up in a similar situation. What was your mistake?
Not reproducible :( The next business day, everything worked, not sure why.
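A possible hint for anyone landing here with the same x509 error. The cause was never confirmed in this thread, so treat this as a sketch under assumptions rather than a fix: the error message says the client could not verify the daemon against the "docker:dind CA", and one unverified guess is that the client certificates in the shared certs volume did not match the running dind service at that moment. A small retry helper can at least tell a transient startup problem apart from a persistent mismatch. The name wait_for is hypothetical and is not part of gitlab-ci-local or Docker.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or the
# attempt budget is spent.
# Usage: wait_for <attempts> <command> [args...]
wait_for() {
  attempts="$1"
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # No need to sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
    i=$((i + 1))
  done
  return 1
}
```

In a before_script this could be called as `wait_for 30 docker info` before the docker-compose line. If it still fails after every attempt, the problem is probably not a startup race, and the contents of the certs volume would be the next thing to inspect.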