
Ability to run SkyPilot Tasks locally

Open romilbhardwaj opened this issue 2 years ago • 4 comments

Using SkyPilot requires access to one or more supported cloud providers. This is a big barrier to entry for users who don't currently have access to clouds, who work with other clouds, or who are just curious about SkyPilot.

It would be nice to have a local mode[^1] which allows users to run SkyPilot tasks on their local machine. #968 provides a prototype of this vision: it creates an SSH-able Docker container that serves as a VM which SkyPilot on-prem can connect to. We should look into having the proposed sky local up/down CLI as a native feature.

Naturally, there are lots of design/engineering choices to be made:

  1. Interface - Do we want to expose the notion of a cluster when launching locally? If so, can we have something better than sky local up/down to manage the local cluster?
  2. Storage - rsync file mounts should work, but should we prohibit S3/GCS mounting?
  3. Accelerators - we need a pipeline in the #968 code to support local GPUs (docker-compose should use nvidia-docker runtime)
  4. num_nodes - We can support simulating multi-node local clusters (by creating multiple local containers), but maybe good to start with single node
  5. SSH port - We need support for specifying the SSH port number in Sky on-prem.
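
To make items 3 to 5 concrete, here is a rough sketch of how per-node container commands could be generated for a simulated multi-node local cluster. Everything named here (`node_run_commands`, the `sky-local:latest` image, the `sky-local-node-N` container names, base port 10022) is hypothetical and not taken from #968. It also uses Docker's `--gpus all` flag on `docker run` rather than the docker-compose plus nvidia-docker runtime setup mentioned above, so treat it as one possible approach under those assumptions.

```python
from typing import List


def node_run_commands(num_nodes: int,
                      image: str = "sky-local:latest",  # hypothetical image name
                      base_ssh_port: int = 10022,       # hypothetical port choice
                      use_gpus: bool = False) -> List[List[str]]:
    """Build one `docker run` command per simulated node.

    Each container's sshd (port 22) is published on a distinct host port,
    which is why the on-prem code would need a configurable SSH port.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    commands = []
    for i in range(num_nodes):
        cmd = ["docker", "run", "-d",
               "--name", f"sky-local-node-{i}",
               "-p", f"{base_ssh_port + i}:22"]
        if use_gpus:
            # Assumption: the NVIDIA container toolkit is installed on the host.
            cmd += ["--gpus", "all"]
        cmd.append(image)
        commands.append(cmd)
    return commands
```

Starting with `num_nodes=1` yields a single command, so the single-node case falls out of the same code path.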

[^1]: LocalDockerBackend aimed to achieve this, but does not have feature parity with the ray backend.

romilbhardwaj avatar Sep 12 '22 19:09 romilbhardwaj

I brought #968 up to date with master here! I squashed everything into one commit (but I can restore the history if you prefer).

What would be a good next step?

ewzeng avatar Oct 04 '22 03:10 ewzeng

Awesome work @ewzeng! We should now look at how we can move some of the local directory's components to either our code or to ~/.sky/. There are a few things I anticipate doing here:

  1. We should move the container creation and deletion logic from local/setup.sh and local/cleanup.sh to cli.local_up and cli.local_down (perhaps define some methods in backend_utils.py and call subprocess.run() there to replicate the shell scripts?).
  2. Instead of using the SSH key hardcoded in local/.env, we should look at using ~/.ssh/sky-key for SSH authentication to the container.
  3. Add a method to automatically build the container image (and its base image) if they are not present, and invoke it in cli.local_up().
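
A minimal sketch of what those three steps might look like if the shell scripts were replaced with `subprocess.run()` calls. The helper names, the image and container names, the port, the `sky` login user, and the `local` build directory default are all assumptions made for illustration; the actual contents of local/setup.sh and local/cleanup.sh are not reproduced here. The `run` parameter exists only so the helpers can be exercised without Docker installed.

```python
import os
import subprocess
from typing import Callable, List

# Hypothetical names and values, for illustration only.
IMAGE = "sky-local:latest"
CONTAINER = "sky-local-node"
SSH_PORT = 10022


def image_exists(image: str, run: Callable = subprocess.run) -> bool:
    """Return True if `docker image inspect` exits with status 0."""
    result = run(["docker", "image", "inspect", image],
                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def local_up(build_dir: str = "local",
             run: Callable = subprocess.run) -> List[List[str]]:
    """Build the image if it is missing, then start the container.

    Returns the list of commands that were issued, in order.
    """
    issued = []
    if not image_exists(IMAGE, run):
        build = ["docker", "build", "-t", IMAGE, build_dir]
        run(build, check=True)
        issued.append(build)
    start = ["docker", "run", "-d", "--name", CONTAINER,
             "-p", f"{SSH_PORT}:22", IMAGE]
    run(start, check=True)
    issued.append(start)
    return issued


def local_down(run: Callable = subprocess.run) -> List[List[str]]:
    """Force-remove the container (stops it first if it is running)."""
    remove = ["docker", "rm", "-f", CONTAINER]
    run(remove, check=True)
    return [remove]


def ssh_command(user: str = "sky") -> List[str]:
    """SSH invocation using ~/.ssh/sky-key instead of a key from local/.env.

    The `sky` login user is an assumption.
    """
    key = os.path.expanduser("~/.ssh/sky-key")
    return ["ssh", "-i", key, "-p", str(SSH_PORT), f"{user}@localhost"]
```

Returning the issued commands keeps the helpers easy to test and log; a real implementation would presumably also handle an already-running container and install the public key into the image.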

There might be more things to do. The idea is to make sky local up and sky local down work generally on any user's laptop which has sky and docker installed.

romilbhardwaj avatar Oct 04 '22 04:10 romilbhardwaj

Sounds good! I will start working on this.

ewzeng avatar Oct 04 '22 17:10 ewzeng

Perfect @ewzeng. It is good to get Sky On-prem and local mode working in tandem (i.e. local mode using Sky onprem as backend).

michaelzhiluo avatar Oct 05 '22 22:10 michaelzhiluo

This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar May 25 '23 02:05 github-actions[bot]

This issue was closed because it has been stalled for 10 days with no activity.

github-actions[bot] avatar Jun 04 '23 02:06 github-actions[bot]

This is now supported with sky local up.

romilbhardwaj avatar Mar 20 '24 23:03 romilbhardwaj