Cromwell GCP error - The referenced network resource cannot be found
Hello,
I am new to Cromwell and am trying to run a test workflow on GCP. I am using the PAPIv2 backend, and here is my config:
$ cat genomics.conf | grep -v '#' | sed '/^$/d'
include required(classpath("application"))
google {
  application-name = "cromwell"
  auths = [
    {
      name = "application-default"
      scheme = "application_default"
    }
  ]
}
engine {
  filesystems {
    gcs {
      auth = "application-default"
      project = "xxxxx"
    }
  }
}
backend {
  default = PAPIv2
  providers {
    PAPIv2 {
      actor-factory = "cromwell.backend.google.pipelines.v2alpha1.PipelinesApiLifecycleActorFactory"
      config {
        project = "xxxxx"
        root = "gs://xxxx/cromwell_execution"
        virtual-private-cloud {
          network-label-key = "xxx"
          subnetwork-label-key = "xxx"
          auth = "application-default"
        }
        name-for-call-caching-purposes: PAPI
        slow-job-warning-time: "24 hours"
        genomics-api-queries-per-100-seconds = 1000
        maximum-polling-interval = 600
        request-workers = 3
        genomics {
          auth = "application-default"
          endpoint-url = "https://genomics.googleapis.com/"
          location = "us-west1"
          restrict-metadata-access = false
          localization-attempts = 3
          parallel-composite-upload-threshold="150M"
        }
        filesystems {
          gcs {
            auth = "application-default"
            project = "xxxx"
            caching {
              duplication-strategy = "copy"
            }
          }
          http { }
        }
        default-runtime-attributes {
          cpu: 1
          failOnStderr: false
          continueOnReturnCode: 0
          memory: "2048 MB"
          bootDiskSizeGb: 10
          disks: "local-disk 10 SSD"
          noAddress: false
          preemptible: 0
          zones: ["us-west1-a", "us-west1-b"]
        }
        include "papi_v2_reference_image_manifest.conf"
      }
    }
  }
}
When I run with the above config using:
java -Dconfig.file=genomics.conf -jar cromwell-66.jar run cumulus.wdl -i cumulus_inputs.json
I am getting the following error message:
[2021-08-24 22:05:33,60] [info] WorkflowManagerActor: Workflow 6cc303b4-295d-49fa-a996-b5cf7ec9beea failed (during ExecutingWorkflowState): java.lang.Exception: Task cumulus.cluster:NA:1 failed. The job was stopped before the command finished. PAPI error code 3. Execution failed: allocating: creating instance: inserting instance: Invalid value for field 'resource.networkInterfaces[0].network': ''. The referenced network resource cannot be found.
I have tried passing the vpc and subnet id using the following config:
virtual-private-cloud {
  network-label-key = "xxx"
  subnetwork-label-key = "xxx"
  auth = "application-default"
}
The above values are my actual VPC and subnet ID/name. However, it still gives me that error message. Is there something I am missing from a configuration perspective? Any help would be greatly appreciated. Our VPC networks are not created in auto mode, and unfortunately that is not something we have control over.
Thanks, -Simran
I have created a new labels file and am using it to pass the VPC/subnet info, but I still get the same error:
$ grep -i label genomics.conf
network-label-key = "my-private-network"
subnetwork-label-key = "my-private-subnetwork"
$ cat labels.json
{
  "my-private-network": "xxxx",
  "my-private-subnetwork": "xxxx"
}
and updated my cromwell command to the following:
java -Dconfig.file=genomics.conf -jar cromwell-66.jar run cumulus.wdl -i cumulus_inputs.json -l labels.json
I still get the same error though. Is this even possible or am I missing something?
Thanks.
In Cromwell versions 67 and earlier, the virtual-private-cloud configuration exclusively specifies Google project label keys, not literal values. The actual values are specified in labels on the Google project. For example, with a VPC config like:
virtual-private-cloud {
  network-label-key = "my-network-label-key"
  subnetwork-label-key = "my-subnetwork-label-key"
  auth = "application-default"
}
As seen in the labels page in the GCP console, there should be project labels with key/values of my-network-label-key/my-private-network and my-subnetwork-label-key/my-private-subnetwork.
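To make the key-versus-value distinction concrete, here is a minimal Python sketch of the lookup idea. It is not Cromwell's actual implementation; the function name resolve_vpc and the sample labels are hypothetical, and the dictionary simply stands in for the labels set on the Google project.

```python
from typing import Mapping, Optional, Tuple


def resolve_vpc(
    network_label_key: str,
    subnetwork_label_key: str,
    project_labels: Mapping[str, str],
) -> Tuple[Optional[str], Optional[str]]:
    """Look up network and subnetwork names via project label keys.

    Illustrative only: the config supplies label KEYS, and the names
    come from the VALUES of the matching labels on the project.
    """
    network = project_labels.get(network_label_key)
    subnetwork = project_labels.get(subnetwork_label_key)
    return network, subnetwork


# Hypothetical labels as they might be set on the GCP project.
project_labels = {
    "my-network-label-key": "my-private-network",
    "my-subnetwork-label-key": "my-private-subnetwork",
}

# Config keys match label keys on the project, so both names resolve.
print(resolve_vpc("my-network-label-key", "my-subnetwork-label-key", project_labels))

# Literal network names used as keys match no project label.
print(resolve_vpc("my-private-network", "my-private-subnetwork", project_labels))
```

In this sketch the second call finds nothing, which mirrors what happens when the config holds the network name itself rather than the key of a project label.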
Thanks @mcovarr for your response. I realized that after my initial post and created a labels.json with the following contents:
{
  "google_labels": {
    "my-private-network": "xxx",
    "my-private-subnetwork": "yyy"
  }
}
where xxx and yyy are my actual VPC network and subnet names in GCP. Then I added the "-l labels.json" option to the cromwell run command, but that still gives me the same error. Am I missing something here? Apologies, but this is what I understand from the posts/docs needs to happen, yet it doesn't work when I try it. Am I supposed to create some label in the actual GCP account as well?
Thanks.
Ahh I think I see what you mean. I don't need the "-l labels.json" but need to create an actual Label in the GCP account that has the following key/value:
my-private-network: xxx
my-private-subnetwork: yyy
I don't have access to create the labels but will have someone do this and try again. Let me know if I am still missing something.
Thanks.
@mcovarr, which resource does the label need to be created on? Your link took me to the IAM & Admin Labels section for GCP. Is that where I should create this label, or on the GCP instance that I am running this command from?
Thanks.
The labels should be created on the GCP project (not on the GCE instance), so the link should be going to the correct location.
Thanks @mcovarr, that seems to have worked. The job has gone into a Running state. Really appreciate your quick response and assistance.
-Simran
@mcovarr, it looks like I got past the initial issue, but I am now getting the following error:
[2021-08-25 01:11:31,83] [info] WorkflowManagerActor: Workflow 2a7b8039-a555-4f58-86b0-dc4a6fa21dff failed (during ExecutingWorkflowState): java.lang.Exception: Task cumulus.cluster:NA:1 failed. The job was stopped before the command finished. PAPI error code 9. generic::failed_precondition: Constraint constraints/compute.trustedImageProjects violated for project gred-cumulus-sb-01-991a49c4. Use of images from project cloud-lifesciences is prohibited.
It looks like our GCP accounts don't allow non-standard images. Which image is this workflow trying to use? Is there a way to provide our own image to this pipeline instead?
Thanks
I am not familiar with that error message. From a bit of Googling it looks like this may be relevant. Assuming cloud-lifesciences is Google's project hosting the image that Cloud Life Sciences is trying to use to spin up the worker VM, you may need to add projects/cloud-lifesciences to your organization's trusted image projects.
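As a rough illustration of what that change amounts to, the Python sketch below adds projects/cloud-lifesciences to the allowed values of a list policy for the constraint named in the error. The dictionary layout and the helper allow_image_project are assumptions made for illustration, and the existing entry projects/my-org-images is hypothetical. The real policy would have to be updated by an organization administrator through whatever tooling your organization uses.

```python
import json
from typing import Any, Dict

# Constraint name taken from the error message above.
CONSTRAINT = "constraints/compute.trustedImageProjects"


def allow_image_project(policy: Dict[str, Any], project_id: str) -> Dict[str, Any]:
    """Return a copy of a list policy with projects/<project_id> allowed.

    The policy shape used here is an assumed, simplified layout.
    """
    allowed = list(policy.get("listPolicy", {}).get("allowedValues", []))
    entry = f"projects/{project_id}"
    if entry not in allowed:  # avoid adding a duplicate entry
        allowed.append(entry)
    return {"constraint": CONSTRAINT, "listPolicy": {"allowedValues": allowed}}


# Hypothetical existing policy that only trusts an in-house image project.
existing = {
    "constraint": CONSTRAINT,
    "listPolicy": {"allowedValues": ["projects/my-org-images"]},
}

updated = allow_image_project(existing, "cloud-lifesciences")
print(json.dumps(updated, indent=2))
```

The helper leaves the input policy unmodified and is safe to call repeatedly, since an entry that is already present is not added again.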