dstack
[Bug]: Cannot re-create GCP and Azure gateways
Steps to reproduce
- Create a gateway on GCP or Azure.
> cat gateways/gcp.dstack.yml
type: gateway
name: gcp
backend: gcp
region: europe-west4
domain: example.com
> dstack apply -f gateways/gcp.dstack.yml -y
BACKEND REGION NAME HOSTNAME DOMAIN DEFAULT STATUS
gcp europe-west4 gcp example.com submitted
- Wait until the gateway is running (see server logs or dstack gateway -w)
- Re-create the gateway
> dstack apply -f gateways/gcp.dstack.yml -y --force
Actual behaviour
Gateway provisioning fails.
> dstack gateway
BACKEND REGION NAME HOSTNAME DOMAIN DEFAULT STATUS
gcp europe-west4 gcp example.com failed
Expected behaviour
Gateway becomes running after re-creation.
dstack version
master
Server logs
[23:52:33] ERROR dstack._internal.server.background.tasks.process_gateways:145 Got exception when creating gateway compute for gateway gcp
Traceback (most recent call last):
File "/home/jvstme/git/dstack/dstack/src/dstack/_internal/server/background/tasks/process_gateways.py", line 122, in _process_submitted_gateway
gateway_model.gateway_compute = await create_gateway_compute(
File "/home/jvstme/git/dstack/dstack/src/dstack/_internal/server/services/gateways/__init__.py", line 127, in create_gateway_compute
gpd = await run_async(
File "/home/jvstme/git/dstack/dstack/src/dstack/_internal/server/utils/common.py", line 23, in run_async
return await asyncio.get_running_loop().run_in_executor(None, func_with_args)
File "/usr/lib64/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/jvstme/git/dstack/dstack/src/dstack/_internal/core/backends/gcp/compute.py", line 439, in create_gateway
operation = self.instances_client.insert(request=request)
File "/home/jvstme/git/dstack/dstack/venv/lib64/python3.8/site-packages/google/cloud/compute_v1/services/instances/client.py", line 4130, in insert
response = rpc(
File "/home/jvstme/git/dstack/dstack/venv/lib64/python3.8/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
return wrapped_func(*args, **kwargs)
File "/home/jvstme/git/dstack/dstack/venv/lib64/python3.8/site-packages/google/api_core/grpc_helpers.py", line 76, in error_remapped_callable
return callable_(*args, **kwargs)
File "/home/jvstme/git/dstack/dstack/venv/lib64/python3.8/site-packages/google/cloud/compute_v1/services/instances/transports/rest.py", line 3197, in __call__
raise core_exceptions.from_http_response(response)
google.api_core.exceptions.Conflict: 409 POST https://compute.googleapis.com/compute/v1/projects/dstack/zones/europe-west4-c/instances: The resource 'projects/dstack/zones/europe-west4-c/instances/gcp' already exists
Additional information
AWS gateways are re-created successfully.
A similar problem occurs when re-creating GCP fleets: the same exception appears in the server logs, but the fleets do not fail because dstack retries fleet provisioning.
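Since fleets survive this only because provisioning is retried, one possible mitigation for gateways is to retry creation while the old instance still exists. Below is a minimal sketch of that idea, not dstack's actual code or a proposed patch. `create_with_retry` and `fake_insert` are hypothetical names, and the local `Conflict` class is a stand-in for `google.api_core.exceptions.Conflict` from the traceback so the snippet runs without GCP dependencies.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class Conflict(Exception):
    """Stand-in for google.api_core.exceptions.Conflict (HTTP 409)."""


def create_with_retry(
    create: Callable[[], T],
    attempts: int = 5,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call create(), retrying on Conflict until attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return create()
        except Conflict:
            # The previous instance with the same name may still be deleting.
            if attempt == attempts:
                raise
            sleep(delay)
    raise ValueError("attempts must be >= 1")


# Hypothetical creation call that conflicts twice, then succeeds.
calls = {"n": 0}


def fake_insert() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise Conflict("resource already exists")
    return "created"


# sleep is replaced with a no-op so the demo does not actually wait.
result = create_with_retry(fake_insert, sleep=lambda _: None)
```

Retrying only hides the race. Waiting for the old instance to be fully deleted before submitting the new gateway would address the cause, assuming the conflict does come from a deletion still in progress.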