
dstack is an open-source alternative to Kubernetes, designed to simplify development, training, and deployment of AI across any cloud or on-prem. It supports NVIDIA, AMD, and TPU.

Results 325 dstack issues

### Steps to reproduce After the examples were moved around, links from the Clusters examples to GitHub are broken: https://github.com/dstackai/dstack/blob/a444b846e51a5b025c0aed081a65d6b5296508fa/examples/clusters/a3high/README.md#L229 Also, the examples themselves are not reproducible since the paths to configurations are...

bug
docs
no-stale

### Steps to reproduce 1. Run `dstack offer --group-by gpu,backend`. 2. Observe that it shows all REGIONS that the backend has for each GPU, even if this GPU is only available for...

bug
no-stale
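
The grouping behaviour that the `dstack offer --group-by gpu,backend` report above appears to expect can be sketched as follows. This is a minimal illustration only: the offer records, their field names, and the helper `regions_by_gpu_and_backend` are hypothetical and do not reflect dstack's actual data model or implementation. The idea is that each (GPU, backend) group lists only the regions taken from offers that actually include that GPU, rather than every region the backend has.

```python
from collections import defaultdict

# Hypothetical offer records; the field names are assumptions, not dstack's schema.
offers = [
    {"backend": "aws", "gpu": "A100", "region": "us-east-1"},
    {"backend": "aws", "gpu": "A100", "region": "us-west-2"},
    {"backend": "aws", "gpu": "H100", "region": "us-east-1"},
    {"backend": "gcp", "gpu": "H100", "region": "us-central1"},
]


def regions_by_gpu_and_backend(offers):
    """Collect, per (gpu, backend) pair, only regions whose offers include that GPU."""
    grouped = defaultdict(set)
    for offer in offers:
        grouped[(offer["gpu"], offer["backend"])].add(offer["region"])
    # Sort regions so the result is deterministic and easy to compare.
    return {key: sorted(regions) for key, regions in grouped.items()}


grouped = regions_by_gpu_and_backend(offers)
for (gpu, backend), regions in sorted(grouped.items()):
    print(gpu, backend, ", ".join(regions))
```

In this sketch the H100 row for the hypothetical `aws` backend lists a single region, even though that backend has offers in two regions overall.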

### Problem A user should be able to see every project and their own role in each one ![Image](https://github.com/user-attachments/assets/a2e64f7d-27aa-49a1-9b5c-c8f5901fd56e) ### Solution - [ ] Add a Role column to the user project list page...

feature
ui
no-stale

### Steps to reproduce Apply the configuration:

```yaml
type: service
image: nginx
port: 80
replicas: 0..1
scaling:
  metric: rps
  target: 1
```

### Actual behaviour Until the first request hits...

ux
no-stale

Currently, dstack Models work with the Chat Completions API, but in March OpenAI introduced the Responses API ([migration guide](https://platform.openai.com/docs/guides/migrate-to-responses)). OpenAI says that the Chat Completions API will not be deprecated,...

feature
no-stale

### Problem If a spot instance is interrupted, `dstack` will only detect the interruption after the instance has been unreachable for a period of time: - If the job is running, it will be...

ux
no-stale
enhancement

### Problem It would be nice to be able to simply restart a run. ### Solution If given the run name, get the last run ID and restart...

feature
no-stale

We recently debugged a case where running multiple server replicas led to high DB load, many active DB sessions, and extremely slow DB queries. This turned out to be caused...

internals
no-stale
performance

Learn more: https://cloud.google.com/blog/products/compute/introducing-dynamic-workload-scheduler I suppose we need to support both flex and calendar modes.

major

### Problem Updating any fleet properties requires stopping all runs that are using the fleet and recreating it, including recreating the underlying VMs in cloud fleets. ### Solution Support in-place...

feature
ux
ssh-fleets
fleets
no-stale
cloud-fleets