
WX-1828 Demo broken private Docker Hub repos in GCP Batch

Open mcovarr opened this issue 6 months ago • 0 comments

Description

PR to demo broken private Docker repo support in GCP Batch. There are actually multiple existing PAPI v2 Centaur tests in this vein; the one test enabled here for GCP Batch seems to be the simplest and demonstrates the issues clearly enough.

The crux of this test is that the Docker image specified for the task is in a private repo to which the Centaur service account has been granted access. This test passes on PAPI v2, but on GCP Batch the jobs fail with messages like the following, visible in the output of gcloud batch jobs describe:

Job state is set from RUNNING to FAILED for job projects/1005074806481/locations/us-central1/jobs/job-27607753-d2d5-404d-89af-a786da8ad383. Job failed due to task failure. Specifically, task with index 0 failed due to the following task event: "Task state is updated from RUNNING to FAILED on zones/us-central1-b/instances/8098872438472929780 with exit code 125."

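Exit code 125 is the typical "something's wrong with that Docker invocation" error: it is what docker run returns when Docker itself fails, as opposed to the command inside the container failing.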
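For triage, the exit code can be pulled out of the task event text and mapped to the meanings that docker run reserves for 125, 126 and 127. This is only a minimal sketch: the function names and the lookup table are hypothetical helpers, not part of Cromwell or the GCP Batch API, and it assumes the event text keeps the "exit code N" wording shown above.

```python
import re
from typing import Optional

# Exit codes that `docker run` reserves for its own failures.
# Any other value is the exit code of the command run inside the container.
DOCKER_RUN_CODES = {
    125: "docker run itself failed (daemon or invocation error)",
    126: "the contained command could not be invoked",
    127: "the contained command was not found",
}


def extract_exit_code(event: str) -> Optional[int]:
    """Return the integer after 'exit code' in a task event, or None if absent."""
    match = re.search(r"exit code (\d+)", event)
    return int(match.group(1)) if match else None


def describe_exit_code(code: int) -> str:
    """Map an exit code to a short explanation (hypothetical helper)."""
    return DOCKER_RUN_CODES.get(code, "exit code of the contained command")


event = (
    "Task state is updated from RUNNING to FAILED on "
    "zones/us-central1-b/instances/8098872438472929780 with exit code 125."
)
code = extract_exit_code(event)
if code is not None:
    print(code, describe_exit_code(code))
```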

In Cloud Logging I see the following, including what looks like a plaintext password, which I have x'd out below:

Executing runnable container:{image_uri:"broadinstitute/cloud-cromwell@sha256:0d51f90e1dd6a449d4587004c945e43f2a7bbf615151308cff40c15998cc3ad4" commands:"/mnt/disks/cromwell_root/script" entrypoint:"/bin/bash" volumes:"/mnt/disks/cromwell_root:/mnt/disks/cromwell_root" username:"firecloud" password:"xxxxx"} labels:{key:"tag" value:"UserRunnable"} for Task task/job-27607753-d2d5-132dc052-df92-4db100-group0-0/0/0 in TaskGroup group0 of Job job-27607753-d2d5-132dc052-df92-4db100.
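When sharing log lines like this one, the password field can be masked mechanically rather than by hand. The sketch below assumes the field always appears as password:"..." with no escaped quotes inside the value; the function names and the sample secret are made up for illustration.

```python
import re

# Matches password:"<anything without a double quote>" as it appears in the
# runnable log line above. Assumption: the value contains no escaped quotes.
PASSWORD_FIELD = re.compile(r'(password:")[^"]*(")')


def redact_password(log_line: str) -> str:
    """Replace the password value with a fixed placeholder."""
    return PASSWORD_FIELD.sub(r"\1xxxxx\2", log_line)


def has_docker_credentials(log_line: str) -> bool:
    """Report whether both credential fields were plumbed into the runnable."""
    return 'username:"' in log_line and 'password:"' in log_line


# Hypothetical sample; the secret here is invented.
sample = 'entrypoint:"/bin/bash" username:"firecloud" password:"not-a-real-secret"}'
print(redact_password(sample))
print(has_docker_credentials(sample))
```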

So it looks like the GCP Batch backend has acquired and plumbed through the required Docker credentials, but the login to Docker Hub doesn't seem to have happened.
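For reference, the step that appears to be missing is equivalent to a docker login before the image is pulled. The sketch below only builds such an invocation and is not a description of how GCP Batch or Cromwell handles credentials; docker_login_invocation is a hypothetical helper. It relies on the docker CLI's --password-stdin flag so the secret stays out of the argument list, and on docker login defaulting to Docker Hub when no registry is given.

```python
from typing import List, Tuple


def docker_login_invocation(
    username: str, password: str, registry: str = ""
) -> Tuple[List[str], str]:
    """Build the argv and stdin payload for a `docker login` call.

    The password is returned separately so it can be fed on stdin
    instead of appearing in the process arguments.
    """
    argv = ["docker", "login", "--username", username, "--password-stdin"]
    if registry:
        # With no registry argument, docker login targets Docker Hub.
        argv.append(registry)
    return argv, password


# Hypothetical credentials for illustration only.
argv, stdin_data = docker_login_invocation("firecloud", "not-a-real-secret")
print(argv)
```

The pair could then be passed to subprocess.run(argv, input=stdin_data, text=True, check=True) on a machine with the docker CLI installed.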

Release Notes Confirmation

CHANGELOG.md

  • [ ] I updated CHANGELOG.md in this PR
  • [ ] I assert that this change shouldn't be included in CHANGELOG.md because it doesn't impact community users

Terra Release Notes

  • [ ] I added a suggested release notes entry in this Jira ticket
  • [ ] I assert that this change doesn't need Jira release notes because it doesn't impact Terra users

mcovarr, Aug 26 '24 19:08