[UX] Do not terminate unreachable instances in SSH fleets

Open peterschmidt85 opened this issue 8 months ago • 1 comments

Steps to reproduce:

  1. Create an SSH fleet with one or more hosts
  2. Disable the connection between the dstack server and the hosts of the fleet
  3. Wait for some time
  4. Restore the connection

Actual behaviour:

  1. Fleet instances become unreachable
  2. After some time (to be checked), the dstack server terminates the instances, along with any jobs running on them

Expected behaviour:

  1. Fleet instances remain unreachable while the connection is not working
  2. As soon as the connection is back, instances go back to idle
  3. The fleet is not deleted

Note:

  • The job running on an unreachable instance should still be terminated, as it is now.
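
To make the expected behaviour and the note above concrete, here is a minimal sketch of the state transitions as I understand them. This is not dstack's actual code: `InstanceStatus`, `Instance`, `on_health_check`, and the `deadline_exceeded` flag are hypothetical names made up for illustration, and the assumption that the job is terminated once the unreachable timeout expires is mine.

```python
from dataclasses import dataclass, replace
from enum import Enum


class InstanceStatus(Enum):  # hypothetical, not dstack's real enum
    IDLE = "idle"
    BUSY = "busy"
    UNREACHABLE = "unreachable"
    TERMINATED = "terminated"


@dataclass(frozen=True)
class Instance:  # hypothetical model of a fleet instance
    status: InstanceStatus
    is_ssh_fleet: bool
    has_job: bool = False


def on_health_check(inst: Instance, reachable: bool, deadline_exceeded: bool) -> Instance:
    """Return the instance state after one connectivity check.

    `deadline_exceeded` is an assumed flag meaning the instance has been
    unreachable for longer than the server's timeout.
    """
    if inst.status is InstanceStatus.TERMINATED:
        return inst  # terminated instances never come back

    if reachable:
        # Connection restored: back to idle, or busy if a job is still attached.
        status = InstanceStatus.BUSY if inst.has_job else InstanceStatus.IDLE
        return replace(inst, status=status)

    if not deadline_exceeded:
        # Unreachable, but still within the timeout: just mark it.
        return replace(inst, status=InstanceStatus.UNREACHABLE)

    # Timeout expired: the job is terminated in every case (as it is now).
    if inst.is_ssh_fleet:
        # Proposed change: SSH fleet instances stay unreachable, not terminated.
        return replace(inst, status=InstanceStatus.UNREACHABLE, has_job=False)
    return replace(inst, status=InstanceStatus.TERMINATED, has_job=False)
```

With this sketch, an SSH fleet instance that loses its connection past the timeout loses its job but stays `UNREACHABLE`, and returns to `IDLE` on the first successful check afterwards. Instances outside SSH fleets keep the current behaviour and are terminated.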

peterschmidt85 · Apr 17 '25 16:04