dstack
dstack is an open-source alternative to Kubernetes, designed to simplify the development, training, and deployment of AI workloads across any cloud or on-prem. It supports NVIDIA and AMD GPUs as well as TPUs.
### Steps to reproduce

1. Open project settings in two tabs (simulate two admins editing the project simultaneously).
2. In the first tab, add User A to the project.
3. ...
**Essential:**

* [x] Request resources according to the `dstack` configuration
* [x] Multi-node support (distributed tasks running on fleets with cluster placement)

**Strategic:**

* [x] AMD GPUs support
* [...
### Steps to reproduce

I was trying to create an SSH fleet with Tenstorrent chips with `blocks: auto` so that we can deploy workloads that use fewer than all the chips on...
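For reference, a minimal sketch of such an SSH fleet configuration (the fleet name, user, identity file, and host address below are placeholders, and the exact placement of the `blocks` property may differ by dstack version):

```yaml
type: fleet
name: tt-fleet  # hypothetical name

# Split each host into blocks so a run can claim a subset
# of its accelerators rather than the whole host
blocks: auto

ssh_config:
  user: ubuntu                  # placeholder
  identity_file: ~/.ssh/id_rsa  # placeholder
  hosts:
    - 192.0.2.10                # placeholder address
```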
### Problem

Right now `retry_policy` works with a time window. It'd be great to also support a maximum number of retries, either within that window or without one.

### Solution...
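For context, today's retry configuration bounds retries only by a time window; a sketch of the proposal could look like the following, where `max_attempts` is a hypothetical field that is not part of dstack:

```yaml
type: task
name: train  # hypothetical name
commands:
  - python train.py

retry:
  on_events: [no-capacity]  # retry when no offer is available
  duration: 2h              # current behavior: keep retrying within this window
  # max_attempts: 5         # proposed (hypothetical): stop after N retries
```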
### Problem

Along with the current max duration logic, which is good for saving cost, it would be great to have an option where the duration only counts "running" time....
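A sketch of what this could look like in a run configuration; `max_duration` exists today as a wall-clock limit, while the scoped form commented out below is hypothetical:

```yaml
type: task
commands:
  - python train.py

max_duration: 8h  # today: wall-clock limit, including provisioning and pulling
# Proposed (hypothetical) shape for this issue:
# max_duration:
#   running: 8h   # count only time spent in the "running" state
```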
### Problem

RunPod is an offline provider, which means offers are sometimes stale, causing provisioning to fail with "no offer".

### Solution

Make RunPod an online provider so offers are...
This tracks the roadmap for implementing native inference capabilities inside dstack. Currently, LLM inference systems (SGLang, Dynamo, Grove, LLM-d, Ai-brix, SGLang OME) revolve around inference-native concepts: TTFT/ITL autoscaling, PD disaggregation,...
Currently, dstack fleets are optional for cloud backends, which means users can provision via dstack in two ways: through fleet configurations or through run configurations. Moreover, fleet configurations are not supported...
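To illustrate the two provisioning paths, a minimal sketch (names and resource values below are arbitrary):

```yaml
# Path 1: provision explicitly via a fleet configuration
type: fleet
name: my-fleet   # hypothetical name
nodes: 2
placement: cluster
resources:
  gpu: 24GB
```

```yaml
# Path 2: a run configuration alone; for cloud backends, dstack
# provisions instances on the fly if no existing fleet matches
type: task
commands:
  - python train.py
resources:
  gpu: 24GB
```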
### Steps to reproduce

This example demonstrates the problem using fleets, but it can also be reproduced with any `prev_*_at` request fields when listing other resources (events, instances, etc.).

1. ...