dstack
dstack is an open-source alternative to Kubernetes, designed to simplify development, training, and deployment of AI across any cloud or on-prem. It supports NVIDIA, AMD, and TPU.
### Problem

These service configuration properties only affect how the proxy handles requests to the service:

- `port`
- `model`
- `https`
- `auth`
- `rate_limits`

Although these properties do...
### Problem

Currently, `dstack` only supports `pd-balanced` GCE disks. For higher-performance persistent storage on GCP, we want the ability to specify the underlying disk type (e.g. `hyperdisk-balanced`).

###...
### Problem

Currently, the `env:` configuration does not support variable interpolation. This means that when we define environment variables like:

```
env:
  - NUM_SHARD=$DSTACK_GPUS_NUM
```

The value is not evaluated...
Steps to reproduce:

1. Create an SSH fleet with one or more hosts
2. Disable the connection between the `dstack` server and the hosts of the fleet
3. Wait for...
Following #2455.

A significant part of the gateway provisioning time is spent installing dependencies. Moving to uv would reduce it to a few seconds. For new gateways, uv can be installed...
### Problem

It can take several minutes (10+) to pull the Docker image to the runner instance, depending on the image size and network speed. Currently, `dstack` only shows that...
### Problem

Currently, both `dstack server` and the `dstack` CLI require direct SSH access to VMs. Similar to #2554 for GCP, AWS recommends [using SSH via Session Manager](https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-getting-started-enable-ssh-connections.html) to enforce the...
Currently, we always require the user to specify `image` when using AMD. It would be cool if we provided a small, up-to-date AMD image with ROCm drivers.
### Problem

When applying new SSH fleets, I faced a "Provisioning timeout expired" error with no further information. I had to restart the server using `--log-level=DEBUG`. This is okay, when I...
### Problem

To follow security best practices, we need to lock down SSH access to GCP VMs to only IAP-authorized users. To do this, we can limit the source...