
[Bug]: Cannot re-create GCP and Azure gateways

Open jvstme opened this issue 1 year ago • 0 comments

Steps to reproduce

  1. Create a gateway on GCP or Azure.
> cat gateways/gcp.dstack.yml 
type: gateway
name: gcp
backend: gcp
region: europe-west4
domain: example.com

> dstack apply -f gateways/gcp.dstack.yml -y
 BACKEND  REGION        NAME  HOSTNAME  DOMAIN              DEFAULT  STATUS    
 gcp      europe-west4  gcp             example.com                  submitted
  2. Wait until the gateway is running (see server logs or dstack gateway -w).
  3. Re-create the gateway.
> dstack apply -f gateways/gcp.dstack.yml -y --force

Actual behaviour

Gateway provisioning fails.

> dstack gateway
 BACKEND  REGION        NAME  HOSTNAME  DOMAIN              DEFAULT  STATUS    
 gcp      europe-west4  gcp             example.com                  failed

Expected behaviour

Gateway becomes running after re-creation.

dstack version

master

Server logs

[23:52:33] ERROR    dstack._internal.server.background.tasks.process_gateways:145 Got exception when creating gateway compute for gateway gcp                  
                    Traceback (most recent call last):                                                                                                         
                      File "/home/jvstme/git/dstack/dstack/src/dstack/_internal/server/background/tasks/process_gateways.py", line 122, in                     
                    _process_submitted_gateway                                                                                                                 
                        gateway_model.gateway_compute = await create_gateway_compute(                                                                          
                      File "/home/jvstme/git/dstack/dstack/src/dstack/_internal/server/services/gateways/__init__.py", line 127, in create_gateway_compute     
                        gpd = await run_async(                                                                                                                 
                      File "/home/jvstme/git/dstack/dstack/src/dstack/_internal/server/utils/common.py", line 23, in run_async                                 
                        return await asyncio.get_running_loop().run_in_executor(None, func_with_args)                                                          
                      File "/usr/lib64/python3.8/concurrent/futures/thread.py", line 57, in run                                                                
                        result = self.fn(*self.args, **self.kwargs)                                                                                            
                      File "/home/jvstme/git/dstack/dstack/src/dstack/_internal/core/backends/gcp/compute.py", line 439, in create_gateway                     
                        operation = self.instances_client.insert(request=request)                                                                              
                      File "/home/jvstme/git/dstack/dstack/venv/lib64/python3.8/site-packages/google/cloud/compute_v1/services/instances/client.py", line 4130,
                    in insert                                                                                                                                  
                        response = rpc(                                                                                                                        
                      File "/home/jvstme/git/dstack/dstack/venv/lib64/python3.8/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__       
                        return wrapped_func(*args, **kwargs)                                                                                                   
                      File "/home/jvstme/git/dstack/dstack/venv/lib64/python3.8/site-packages/google/api_core/grpc_helpers.py", line 76, in                    
                    error_remapped_callable                                                                                                                    
                        return callable_(*args, **kwargs)                                                                                                      
                      File "/home/jvstme/git/dstack/dstack/venv/lib64/python3.8/site-packages/google/cloud/compute_v1/services/instances/transports/rest.py",  
                    line 3197, in __call__                                                                                                                     
                        raise core_exceptions.from_http_response(response)                                                                                     
                    google.api_core.exceptions.Conflict: 409 POST https://compute.googleapis.com/compute/v1/projects/dstack/zones/europe-west4-c/instances: The
                    resource 'projects/dstack/zones/europe-west4-c/instances/gcp' already exists

Additional information

AWS gateways are re-created successfully.

A similar problem occurs when re-creating GCP fleets: the same exception appears in the server logs, but the fleets do not fail because dstack retries fleet provisioning.
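
To illustrate why retrying hides the error for fleets, here is a minimal, self-contained sketch. It assumes the 409 is raised because the previous instance with the same name still exists when the new one is inserted; that cause is not confirmed in this issue. `Conflict` is a local stand-in for `google.api_core.exceptions.Conflict`, and `insert_with_retry` and `FakeZone` are hypothetical names, not dstack code.

```python
import time


class Conflict(Exception):
    """Local stand-in for google.api_core.exceptions.Conflict (HTTP 409)."""


def insert_with_retry(insert, attempts=5, delay=0.0):
    """Call insert() until it stops raising Conflict or attempts run out.

    Hypothetical helper: it mirrors the retry behaviour described for
    fleets, not the actual dstack implementation.
    """
    for attempt in range(1, attempts + 1):
        try:
            return insert()
        except Conflict:
            if attempt == attempts:
                raise  # give up and surface the 409 to the caller
            time.sleep(delay)


class FakeZone:
    """Simulated zone where the old instance lingers for a few calls."""

    def __init__(self, deleting_for):
        self.deleting_for = deleting_for

    def insert(self):
        if self.deleting_for > 0:
            self.deleting_for -= 1
            raise Conflict("The resource already exists")
        return "created"


# The first two inserts hit the conflict, the third one succeeds.
zone = FakeZone(deleting_for=2)
result = insert_with_retry(zone.insert)
```

Under the same assumption, a gateway provisioned with a single attempt would fail on the first `Conflict`, which matches the `failed` status shown above.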

jvstme, Sep 30 '24 21:09