dstack
dstack is an open-source alternative to Kubernetes, designed to simplify development, training, and deployment of AI across any cloud or on-prem. It supports NVIDIA, AMD, and TPU.
**1.1. [Long-term] Registering an existing network volume** They want to be able to register an existing volume (just like we already support for AWS). Example for AWS: ```yaml type: volume...
Steps to reproduce 1. Create a gateway 2. Run a service 3. Delete the gateway 4. Create a gateway Actual behavior: 1. Couldn't create a gateway a second time (while the...
### Problem Users cannot find out which instance a particular job is running on. Knowing the exact instance can be useful in many cases, such as terminating an underperforming instance....
### Problem dstack apparently does not select the best offer available in terms of compute resources and price. Consider the partial config below: ```yaml ... resources: gpu: A5000:24GB:1 memory: 30GB.....
### Problem First, thank you for the great work on this project! I have an on-prem server with one A100 GPU, and I’m using ssh-fleet with auto block enabled. My...
### Steps to reproduce 1. Apply a service configuration with `name` set. For simpler reproduction, use `replicas: 0..1`. 2. Simulate a server-to-gateway communication issue, e.g., turn off the network on...
### Problem Currently, `dstack-runner` always [listens](https://github.com/dstackai/dstack/blob/b6aa7e89781d59ac2caebf3590afbf42a3a43d4a/runner/cmd/runner/main.go#L41) on all network interfaces, including the instance's public IP address if the container runs in [host network mode](https://github.com/dstackai/dstack/blob/b6aa7e89781d59ac2caebf3590afbf42a3a43d4a/src/dstack/_internal/core/models/common.py#L118). In general, this is not an...
By default, runs have no retry policy. If a run can’t find capacity at submission time, it fails with no offers. In most cases, this isn’t the desired default. A...
### Steps to reproduce When attached via `dstack apply`, if the connection drops (possibly due to a laptop going to sleep and coming back), dstack will notify the user and...
### Steps to reproduce 1. Clone a Git repo using SSH auth Example: ```shell git clone [email protected]:peterschmidt85/git-submodule-parent.git cd git-submodule-parent ``` 2. Add a submodule using HTTPS auth ```shell git submodule...