dstack
Do not fail if user-specified Docker image is non-root
Steps to reproduce
Try running a configuration with a non-root image.
> cat prometheus.dstack.yml
type: task
image: bitnami/prometheus
ports:
  - 9090
resources:
  memory: 0.5GB..
  cpu: 1..
> dstack run . -f prometheus.dstack.yml
Actual behaviour
The run fails. CLI:
Configuration prometheus.dstack.yml
Project main
User admin
Pool name default-pool
Min resources 1..xCPU, 0.5GB..
Max price -
Max duration 72h
Spot policy auto
Retry policy no
Creation policy reuse-or-create
Termination policy destroy-after-idle
Termination idle time 300s
 #  BACKEND  REGION          INSTANCE  RESOURCES                 SPOT  PRICE
 1  aws      us-west-2       t2.small  1xCPU, 2GB, 100GB (disk)  yes   $0.004
 2  aws      ap-southeast-1  t2.small  1xCPU, 2GB, 100GB (disk)  yes   $0.0062
 3  aws      eu-central-1    t2.small  1xCPU, 2GB, 100GB (disk)  yes   $0.0068
    ...
Shown 3 of 761 offers, $49.159 max
Continue? [y/n]: y
spotty-monkey-1 provisioning completed (failed)
Run failed with error code JobTerminationReason.INTERRUPTED_BY_NO_CAPACITY. Check CLI and server logs for more details.
Server logs:
ERROR 2024-04-04T11:41:36.084 dstack._internal.server.background.tasks.process_running_jobs The docker container of the job 'spotty-monkey-1-0-0' is not working: exit code: 127, error
DEBUG 2024-04-04T11:41:36.085 dstack._internal.server.background.tasks.process_running_jobs runner healthcheck: {'state': 'pending', 'container_name': 'spotty-monkey-1-0-0', 'status': 'exited', 'running': False, 'oom_killed': False, 'dead': False, 'exit_code': 127, 'error': ''}
shim.log on the cloud instance:
Reading package lists...
E: List directory /var/lib/apt/lists/partial is missing. - Acquire (2: No such file or directory)
/bin/sh: 1: yum: not found
Expected behaviour
The configuration runs successfully.
dstack version
0.17.0
Server logs
No response
Additional information
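The main error here is "E: List directory /var/lib/apt/lists/partial is missing. - Acquire (2: No such file or directory)". It happens because the bitnami/prometheus image runs as a non-root user, so the package installation attempted inside the container cannot write to the apt directories. The subsequent "yum: not found" line matches exit code 127, which a shell returns when a command is not found. See https://stackoverflow.com/a/57930100 and https://docs.bitnami.com/tutorials/work-with-non-root-containers/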
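One way to detect this situation up front is to look at the image's configured user, which Docker reports via `docker inspect --format '{{.Config.User}}' <image>`. The helper below is only a minimal sketch of that check, not dstack code: the function name `runs_as_root` is hypothetical, and it assumes the value follows Docker's usual `user`, `user:group`, `uid`, or `uid:gid` forms, with an empty value meaning root.

```python
def runs_as_root(user_field: str) -> bool:
    """Return True if a Docker image's Config.User value denotes root.

    Hypothetical helper for illustration; not part of dstack.
    """
    # Keep only the user part of "user:group" / "uid:gid".
    user = user_field.strip().split(":", 1)[0]
    # An empty User field means the image runs as root by default.
    return user in ("", "0", "root")


if __name__ == "__main__":
    # Sample Config.User values; "1001" is a typical non-root UID.
    for value in ("", "root", "0:0", "1001", "1001:1001"):
        print(repr(value), runs_as_root(value))
```

With a check like this, a setup step could skip or adapt package installation for non-root images instead of failing the run.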