
environments-preview feature not exchanging DOCKER_HOST

Open juftin opened this issue 1 year ago • 2 comments

Describe the bug

Enabling the environments feature breaks the Docker build in CI/CD: the build tries to reach the Docker daemon at the default socket instead of the one set in DOCKER_HOST.


We have a docker_image that depends on a pex_binary:

BUILD
pex_binary(
    name = "bin",
    dependencies = [
        ":lib",
    ],
    execution_mode = "venv",
    include_tools = True,
    layout = "packed",
)

docker_image(
    name = "docker",
    dependencies = [
        ":docker_resources",
        ":bin",
    ],
    image_tags = ["{build_args.PANTS_DISTRIBUTION_VERSION}"],
)

Due to some wheel compatibility issues, we cannot build this pex_binary (and the docker_image that depends on it) on our Macs. It only builds in CI/CD (self-hosted GitHub Actions runners), where the machine is linux_x86.

To work around this locally, we enabled the environments feature:

BUILD
local_environment(
  name="local_linux_x86",
  description="Localhost x86 Linux Environment",
  compatible_platforms=["linux_x86_64"],
  fallback_environment="docker_x86",
)

docker_environment(
  name="local_docker_x86",
  description="Dockerized x86 Linux Environment",
  platform="linux_x86_64",
  image="python:3.8.16",
)

We then attached the environment to the pex_binary:

BUILD
pex_binary(
    name = "bin",
    dependencies = [
        ":lib",
    ],
    execution_mode = "venv",
    include_tools = True,
    layout = "packed",
    environment = "local_linux_x86",
)

The idea is that the pex_binary must be built on a linux_x86_64 platform, and when that platform isn't available locally, Pants should fall back to the docker_environment.

This works for us locally, but it fails when we run it in CI/CD. Here is what we see there:

Log Output
15:54:05.51 [DEBUG] Starting: Scheduling: Building docker image ***.dkr.ecr.us-east-1.amazonaws.com/moz-batch-job:16.0.0-beta.1
15:54:05.54 [DEBUG] Starting: acquire_command_runner_slot
15:54:19.81 [DEBUG] Completed: setup_sandbox
15:54:19.86 [DEBUG] spawned local process as Some(1133) for Process { argv: ["/usr/bin/docker", "build", "--pull=False", "--tag", "***.dkr.ecr.us-east-1.amazonaws.com/moz-helper-services:latest", "--tag", "***.dkr.ecr.us-east-1.amazonaws.com/moz-helper-services:16.0.0-beta.1", "--build-arg", "PANTS_DISTRIBUTION_VERSION", "--file", "services/moz-helper-services/Dockerfile", "."], env: {"PANTS_DISTRIBUTION_VERSION": "16.0.0-beta.1", "PATH": "/tmp/pants-sandbox-q6Zu6z/.shims/bin", "__UPSTREAM_IMAGE_IDS": ""}, working_directory: None, input_digests: InputDigests { complete: DirectoryDigest { digest: Digest { hash: Fingerprint<2c4455e78a6e491e19a07eca09e00a2fc9d026b6e428026680cb253cbbf4d541>, size_bytes: 264 }, tree: "Some(..)" }, nailgun: DirectoryDigest { digest: Digest { hash: Fingerprint<e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855>, size_bytes: 0 }, tree: "Some(..)" }, input_files: DirectoryDigest { digest: Digest { hash: Fingerprint<79b1e31b408edaf0ecb646a2deaf492f0d62436d46c7e020659821ef142b3191>, size_bytes: 184 }, tree: "Some(..)" }, immutable_inputs: {RelativePath(".shims"): DirectoryDigest { digest: Digest { hash: Fingerprint<e6c1927a092ff6beaa1fa4f372248c975cb51869b1c28245cb0281ebc6816ecb>, size_bytes: 78 }, tree: "Some(..)" }}, use_nailgun: {} }, output_files: {}, output_directories: {}, timeout: None, execution_slot_variable: None, concurrency_available: 0, description: "Building docker image ***.dkr.ecr.us-east-1.amazonaws.com/moz-helper-services:latest +1 additional tag.", level: Info, append_only_caches: {}, jdk_home: None, platform: Linux_x86_64, cache_scope: PerSession, execution_strategy: Local, remote_cache_speculation_delay: 0ns }
15:54:19.89 [INFO] Completed: Building docker image ***.dkr.ecr.us-east-1.amazonaws.com/moz-helper-services:latest +1 additional tag.
15:54:19.89 [DEBUG] Completed: Scheduling: Building docker image ***.dkr.ecr.us-east-1.amazonaws.com/moz-helper-services:latest +1 additional tag.
15:54:19.89 [DEBUG] Completed: acquire_command_runner_slot
15:54:19.89 [DEBUG] Running Building docker image ***.dkr.ecr.us-east-1.amazonaws.com/moz-result-processor:16.0.0-beta.1 under semaphore with concurrency id: 1, and concurrency: 1
15:54:19.89 [INFO] Starting: Building docker image ***.dkr.ecr.us-east-1.amazonaws.com/moz-result-processor:16.0.0-beta.1
15:54:19.89 [DEBUG] Starting: setup_sandbox
15:54:19.89 [WARN] Docker build failed for `docker_image` services/moz-helper-services:docker. The services/moz-helper-services/Dockerfile has `COPY` instructions for source files that may not have been found in the Docker build context.

However there are possible matches. Please review the following list of suggested renames:

  * services.moz-helper-services/bin.pex => services/moz-helper-services


15:54:19.90 [DEBUG] Completed: `publish` goal
15:54:19.90 [DEBUG] computed 1 nodes in 44.638043 seconds. there are 13299 total nodes.
15:54:19.90 [ERROR] 1 Exception encountered:

Engine traceback:
  in select
    ..
  in pants.core.goals.publish.run_publish
    `publish` goal
  in pants.core.goals.publish.package_for_publish
    ..
  in pants.core.goals.package.environment_aware_package
    ..
  in pants.backend.docker.goals.package_image.build_docker_image
    ..

Traceback (most recent call last):
  File "/home/runner/.cache/pants/setup/bootstrap-Linux-x86_64/2.15.1rc2_py38/lib/python3.8/site-packages/pants/engine/internals/selectors.py", line 593, in native_engine_generator_send
    res = func.send(arg)
  File "/home/runner/.cache/pants/setup/bootstrap-Linux-x86_64/2.15.1rc2_py38/lib/python3.8/site-packages/pants/backend/docker/goals/package_image.py", line 309, in build_docker_image
    raise ProcessExecutionFailure(
pants.engine.process.ProcessExecutionFailure: Process 'Building docker image ***.dkr.ecr.us-east-1.amazonaws.com/moz-helper-services:latest +1 additional tag.' failed with exit code 1.
stdout:

stderr:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

As far as I can tell, enabling the environments feature causes Pants to lose track of DOCKER_HOST: our GitHub runners set DOCKER_HOST to unix:///run/docker/docker.sock rather than the default unix:///var/run/docker.sock, yet the build tries the default. We do pass this environment variable through in our pants.toml file, though:

pants.toml
[docker]
env_vars = [
  "DOCKER_CONFIG=%(homedir)s/.docker",
  "DOCKER_DEFAULT_PLATFORM=linux/amd64",
  "HOME",
  "USER",
  "PATH",
  # used by action-runner-controller dind
  "DOCKER_CERT_PATH",
  "DOCKER_HOST",
  "DOCKER_TLS_VERIFY",
]
tools = [
  "dirname",
  "readlink",
  "python3",
  # These may be necessary if using Pyenv-installed Python.
  "cut",
  "sed",
  "bash",
  "sh",
]
default_repository = "{directory}"
build_args = ["PANTS_DISTRIBUTION_VERSION"]
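For illustration, here is a minimal Python sketch of why a missing DOCKER_HOST would produce exactly the error above. It assumes the documented meaning of `env_vars` entries (`NAME=value` sets a literal value, a bare `NAME` copies the value from the parent environment) and the Docker CLI's fallback to `unix:///var/run/docker.sock` when DOCKER_HOST is absent. The helper names `resolve_env_vars` and `effective_docker_host` are hypothetical, not Pants or Docker APIs. Note that the spawned process in the log lists only PANTS_DISTRIBUTION_VERSION, PATH and __UPSTREAM_IMAGE_IDS in its env, which is consistent with the `[docker].env_vars` list not being applied.

```python
from typing import Dict, List, Mapping

# Socket the Docker CLI falls back to on Linux when DOCKER_HOST is not set.
DEFAULT_DOCKER_HOST = "unix:///var/run/docker.sock"


def resolve_env_vars(entries: List[str], parent_env: Mapping[str, str]) -> Dict[str, str]:
    """Hypothetical helper: resolve env_vars-style entries against a parent env.

    "NAME=value" sets NAME to the literal value; a bare "NAME" copies the
    value from parent_env and is skipped when the parent does not define it.
    """
    resolved: Dict[str, str] = {}
    for entry in entries:
        name, sep, value = entry.partition("=")
        if sep:
            resolved[name] = value
        elif name in parent_env:
            resolved[name] = parent_env[name]
    return resolved


def effective_docker_host(process_env: Mapping[str, str]) -> str:
    """Hypothetical helper: the daemon address a docker client would try."""
    return process_env.get("DOCKER_HOST") or DEFAULT_DOCKER_HOST


# Parent environment resembling the CI runner described in this issue.
runner_env = {"DOCKER_HOST": "unix:///run/docker/docker.sock", "HOME": "/home/runner"}

# With DOCKER_HOST listed, the runner's socket is used.
with_host = resolve_env_vars(["HOME", "DOCKER_HOST"], runner_env)
print(effective_docker_host(with_host))

# If the list is dropped or replaced, the client falls back to the default socket.
without_host = resolve_env_vars(["HOME"], runner_env)
print(effective_docker_host(without_host))
```

The second case matches the failure in the traceback: the client tries the default socket, which does not exist on these runners.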

We've tried several fixes: upgrading to pants==2.15.1rc2, removing the local_environment completely and using only the docker_environment, using pants.ci.toml to override which environment to use, and even hardcoding the DOCKER_HOST env var in pants.ci.toml.

Everything works in CI/CD when we specify no environments at all, but then the build doesn't work locally. Conversely, it works locally when we do specify an environment, but then it fails in CI/CD.

Pants version 2.15.0 / 2.15.1rc2

OS Linux (CI/CD)


juftin, May 05 '23 16:05

I think I've found a workaround for this: adding DOCKER_HOST to the local_environment's docker_env_vars field. E.g.:

local_environment(
  name="local_linux_x86",
  description="Localhost x86 Linux Environment",
  compatible_platforms=["linux_x86_64"],
  fallback_environment="docker_x86",
  docker_env_vars=["DOCKER_HOST"]
)

I can't reproduce this locally when running Pants from source (on a Mac).

It seems like the local_environment config is effectively unsetting DOCKER_HOST for the subsequent docker image build (which is not configured to use the environment). I had a quick look at the source code, but it wasn't clear to me where this would be happening.
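To help narrow this down on a runner, a small diagnostic along these lines could be run in the CI job to show which Docker socket actually exists. This is only a sketch: `unix_socket_path` and `is_socket` are hypothetical helper names, and the two candidate addresses are simply the ones mentioned in this issue.

```python
import os
import stat
from typing import Optional


def unix_socket_path(docker_host: str) -> Optional[str]:
    """Return the filesystem path for a unix:// address, or None for other schemes."""
    prefix = "unix://"
    if docker_host.startswith(prefix):
        return docker_host[len(prefix):]
    return None


def is_socket(path: str) -> bool:
    """True if the path exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False


# Addresses taken from this issue: the runner's DOCKER_HOST and the Docker default.
candidates = [
    os.environ.get("DOCKER_HOST", ""),
    "unix:///run/docker/docker.sock",
    "unix:///var/run/docker.sock",
]

for address in candidates:
    path = unix_socket_path(address)
    if path is None:
        print(f"{address!r}: not a unix socket address, skipped")
    else:
        print(f"{address!r}: socket present = {is_socket(path)}")
```

If only the /run/docker/docker.sock entry reports a socket, that would support the theory that the docker build process is started without DOCKER_HOST.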

riisi, Apr 04 '24 01:04