skypilot icon indicating copy to clipboard operation
skypilot copied to clipboard

[Core][AWS] Allow specification of Security Groups for resources.

Open JGSweets opened this issue 1 year ago • 3 comments

Similar to https://github.com/skypilot-org/skypilot/pull/3488 but for SGs This PR:

  • Allows specification of SGs for controllers vs workers.
  • Currently removes the port requirement to be able to allow SG specification in serve.

Addresses: https://github.com/skypilot-org/skypilot/issues/3473

In ~/.sky/config.yaml:

  • When setting for a controller wildcard and all other resources ("*")
aws:
  security_group_name:
    - sky-serve-controller-*: skypilot-controller-fake-sg
    - "*": skypilot-default-fake-sg 
  • When setting for everything by specifying a default ("*")
aws:
  security_group_name:
    - "*": skypilot-default-fake-sg
  • When setting for everything by specifying just the security_group_name via string
aws:
  security_group_name: skypilot-default-fake-sg

Tested (run the relevant ones):

  • [x] Code formatting: bash format.sh
  • [x] Any manual or new tests for this PR (please specify below)
    • tested running in all 3 configs in AWS
      • Nothing set
      • setting different controller and default values (controller vs replica in serve received different remote_identities)
      • setting default only
  • [ ] All smoke tests: pytest tests/test_smoke.py
  • [ ] Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • [ ] Backward compatibility tests: bash tests/backward_comaptibility_tests.sh

EDITED:

  • refactored to match the IAM format.

JGSweets avatar May 01 '24 16:05 JGSweets

@Michaelvll @concretevitamin do you have any suggestions for the concerns I raised? Thanks!

JGSweets avatar May 08 '24 17:05 JGSweets

@Michaelvll @cblmemo I've updated the PR with the ClusterName refactor.

Do we want to further refactor provisioners? If so, should we do it in this PR or a subsequent PR?

JGSweets avatar May 13 '24 01:05 JGSweets

@Michaelvll Could you please re-review this PR? Thanks!

JGSweets avatar Jun 24 '24 18:06 JGSweets

Seems we have missed one place below: https://github.com/JGSweets/skypilot/blob/22221f7ec7a4270c6308c256b10d2068c06a14be/sky/execution.py#L58-L59

It should be the following instead?

    resources_utils.ClusterName(clone_disk_from,
            handle.cluster_name_on_cloud)

Michaelvll avatar Jul 10 '24 18:07 Michaelvll

@Michaelvll Good find! Not sure how I missed that. I've fixed it.

JGSweets avatar Jul 10 '24 19:07 JGSweets

Updated tests passed as well.

JGSweets avatar Jul 10 '24 20:07 JGSweets

Sorry for the delay @JGSweets! The tests are all passed. Merging it now. Thank you for your contribution!

Michaelvll avatar Jul 11 '24 19:07 Michaelvll