skypilot
[Serve] Change back from autodown to autostop
#3377 changed the SkyServe controller from autostop to autodown, which tears down the controller when the sky serve controller job exits unexpectedly and removes any related replica information and logs. This PR changes it back to autostop to preserve that information.
Tested (run the relevant ones):
- [ ] Code formatting: `bash format.sh`
- [ ] Any manual or new tests for this PR (please specify below)
- [ ] All smoke tests: `pytest tests/test_smoke.py`
- [ ] Relevant individual smoke tests: `pytest tests/test_smoke.py::test_fill_in_the_name`
- [ ] Backward compatibility tests: `conda deactivate; bash -i tests/backward_compatibility_tests.sh`
For discussion: it seems fine to autostop the serve controller, since #3521 made the controller on Kubernetes skip the autostop configuration. Wdyt @romilbhardwaj?
@cblmemo please check #3521, #3524, and #3525. We need to make sure the serve controller skips the autostop setting so that it can run on Kubernetes.
Yes, should be okay if we skip autostop similar to #3521. So IIUC, the behavior is going to be:
Serve and jobs controller:
- On k8s - run indefinitely
- On other clouds - autostop after configured time
Let's go with this for now and reevaluate the "run indefinitely" based on user feedback.
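For reference, the behavior summarized above could be sketched roughly as follows. This is only an illustration of the decision being discussed, not SkyPilot's actual code: `AutostopPolicy`, `controller_autostop_policy`, and the 10-minute default are hypothetical names and values made up for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AutostopPolicy:
    """Hypothetical container for a controller's idle policy."""
    idle_minutes: int
    # True would mean autodown (tear the controller down);
    # False means autostop (stop it but keep its disk, info, and logs).
    down: bool


def controller_autostop_policy(
        cloud: str, idle_minutes: int = 10) -> Optional[AutostopPolicy]:
    """Return the idle policy for a serve/jobs controller.

    Returns None when the controller should run indefinitely.
    The default of 10 minutes is an assumption for illustration only.
    """
    if cloud.lower() in ('kubernetes', 'k8s'):
        # Per the discussion: controllers on k8s skip autostop entirely.
        return None
    # Other clouds: autostop rather than autodown, so replica
    # information and logs survive an unexpected controller job exit.
    return AutostopPolicy(idle_minutes=idle_minutes, down=False)
```

With this sketch, `controller_autostop_policy('kubernetes')` yields `None` (run indefinitely), while any other cloud gets an autostop policy with `down=False`.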
Bumping this @Michaelvll @romilbhardwaj - are there any changes I need to make for this PR? IIUC it will automatically skip autostop for the k8s controller?