[Bug]: A container creation error on an instance inaccurately sets the job status
dstack version
0.16.5
Python version
3.11
Host OS
Arch Linux
Host Arch
x86_64
What happened?
When a job is in the pulling state, the container is being prepared to run on the instance (the equivalents of docker pull and docker create are executed). Any error that occurs while the container is being prepared is always recorded as JobErrorCode.INTERRUPTED_BY_NO_CAPACITY in the job's error status. This is wrong: a container creation failure is not a capacity problem.
For container creation errors, we need to add a JobErrorCode.CREATING_CONTAINER_ERROR code and set it on the job when such an error occurs.
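The proposed behavior could be sketched roughly as follows. This is only an illustration, not dstack's actual code: the JobErrorCode enum below is a local stand-in with just the two codes mentioned in this issue (its string values are made up), and ContainerPreparationError, NoCapacityError, and error_code_for are hypothetical names introduced for the example.

```python
from enum import Enum
from typing import Optional


class JobErrorCode(str, Enum):
    # Local stand-in for illustration only; string values are assumptions.
    INTERRUPTED_BY_NO_CAPACITY = "interrupted_by_no_capacity"
    # The new code proposed in this issue.
    CREATING_CONTAINER_ERROR = "creating_container_error"


class ContainerPreparationError(Exception):
    """Hypothetical: pulling the image or creating the container failed."""


class NoCapacityError(Exception):
    """Hypothetical: the instance was lost or no capacity is available."""


def error_code_for(exc: Exception) -> Optional[JobErrorCode]:
    """Map a failure during the pulling state to a job error code.

    Container preparation failures get their own code instead of being
    reported as a capacity interruption.
    """
    if isinstance(exc, ContainerPreparationError):
        return JobErrorCode.CREATING_CONTAINER_ERROR
    if isinstance(exc, NoCapacityError):
        return JobErrorCode.INTERRUPTED_BY_NO_CAPACITY
    # Unknown failures are left unclassified rather than mislabeled.
    return None
```

With a mapping like this, a failed docker create would surface as CREATING_CONTAINER_ERROR in the CLI output rather than INTERRUPTED_BY_NO_CAPACITY.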
CLI logs
❯ dstack run . -f t.dstack.yaml -y
Configuration t.dstack.yaml
Project main
User admin
Pool name default-pool
Min resources 2..xCPU, 8GB..
Max price -
Max duration 72h
Spot policy auto
Retry policy no
Creation policy reuse-or-create
Termination policy destroy-after-idle
Termination idle time 300s
# BACKEND REGION INSTANCE RESOURCES SPOT PRICE
1 gcp us-east1 e2-standard-2 2xCPU, 8GB, 100GB (disk) yes $0.020103 Idle
2 gcp us-east1 e2-standard-2 2xCPU, 8GB, 100GB (disk) yes $0.020103
3 gcp us-east1 e2-highmem-2 2xCPU, 16GB, 100GB (disk) yes $0.02712
...
Shown 3 of 303 offers, $36.59 max
lazy-lion-1 provisioning completed (failed)
Run failed with error code JobErrorCode.INTERRUPTED_BY_NO_CAPACITY. Check CLI and server logs for more details.
Server logs
No response
Runner logs
No response
Additional Information
No response