skypilot icon indicating copy to clipboard operation
skypilot copied to clipboard

[Sky Onprem] Features and Fixes for Sky Onprem

Open michaelzhiluo opened this issue 2 years ago • 0 comments

Features

  • [ ] sky admin status - New CLI command that lists all user jobs for the administrator
  • [ ] Add local cluster spillover to multiple clouds
  • [ ] When registering a local cluster, make it file-based (no need to run sky launch -c [LOCAL_CLUSTER] '')
  • [ ] Make Sky installation from admin perspective only for the root account (or even better, remove Sky installation completely for admin)
  • [ ] Sky YAML script to automatically install Ray/Sky and create local cluster on a public cloud

Fixes

  • [x] Add schema check for cluster config yaml #1044
  • [ ] Detect more GPU resource types for detecting accelerators on the local cluster
  • [x] Refactor local cluster methods out of cli.py #1043
  • [ ] Make local cluster job submission logic only switch users once (in sky_app_[ID].py)
  • [x] Fix potential python conflict between Ray cluster's python (root user's python) and user's python #1030
  • [x] When sky queue [CLUSTER] is specified, do not output Local cluster is not initialized... #1038

Other Features mentioned in #801 (lower priority):

  • [ ] sky admin useradd - CLI command to add and remove users on the local cluster
  • [ ] sky admin clean - Recover from a bad state if onprem cluster breaks

michaelzhiluo avatar Jul 27 '22 22:07 michaelzhiluo