[Meta] Improve `kubernetes` backend

Open • un-def opened this issue 3 months ago • 1 comment

Essential:

  • [x] Request resources according to the dstack configuration
  • [x] Multi-node support (distributed tasks running on fleets with cluster placement)

Strategic:

  • [x] AMD GPU support
  • [ ] Allow configuring multiple clusters per backend (e.g. one per region)
  • [ ] Auto-scaling support (ideally, find a way to support it on any cloud)

Improvements:

  • [x] Update the jump pod: use a lightweight image, restrict SSH access (see TODOs in `_create_jump_pod_service`)
  • [x] Test and, if required, update the gateway functionality on managed/self-hosted Kubernetes other than EKS (see TODO in `KubernetesCompute.create_gateway`)

un-def • Sep 23 '25 14:09