REP: Ray authn/authz with Kubernetes RBAC
Some feedback I got from an internal security review at Google (cc @mtaufen, @vinayakankugoyal):
- Start with support for `audiences` in the TokenReview API to avoid unintended use of tokens mounted into Ray containers
- For Raylet identity, manage a separate token from the same ServiceAccount using projected volumes https://kubernetes.io/docs/concepts/storage/projected-volumes/#serviceaccounttoken. This avoids using the default token, which is only intended for use with the K8s API.
- Try to introduce finer-grained access control (read, write, etc.) earlier, as it can be difficult to retrofit later
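
To make the first two points concrete, here is a minimal sketch of what the request and volume shapes could look like. It only builds the JSON bodies and does not call the API server. The audience string `ray-cluster.example`, the volume name, and all helper function names are assumptions for illustration; the REP does not define them. The field names (`spec.audiences`, `serviceAccountToken.audience`, `expirationSeconds`, `path`) come from the Kubernetes TokenReview and projected volume APIs.

```python
from typing import Any, Dict, List

# Hypothetical audience for Ray-bound tokens; not defined by the REP.
RAY_AUDIENCE = "ray-cluster.example"


def build_token_review(token: str, audiences: List[str]) -> Dict[str, Any]:
    """Build a TokenReview body that asks the API server to validate
    the token against the given audiences."""
    return {
        "apiVersion": "authentication.k8s.io/v1",
        "kind": "TokenReview",
        "spec": {"token": token, "audiences": audiences},
    }


def build_projected_token_volume(
    name: str, audience: str, path: str = "token", expiration_seconds: int = 3600
) -> Dict[str, Any]:
    """Build a pod volume that projects a ServiceAccount token bound to
    `audience`, separate from the default K8s API token."""
    return {
        "name": name,
        "projected": {
            "sources": [
                {
                    "serviceAccountToken": {
                        "audience": audience,
                        "expirationSeconds": expiration_seconds,
                        "path": path,
                    }
                }
            ]
        },
    }


def is_accepted(status: Dict[str, Any], expected_audience: str) -> bool:
    """Accept a TokenReview status only if the token authenticated AND the
    API server confirmed it is valid for the expected audience."""
    return bool(status.get("authenticated")) and expected_audience in status.get(
        "audiences", []
    )
```

With this shape, a default ServiceAccount token (whose audience is the K8s API server) would fail `is_accepted` for the Ray audience, which is the "unintended use" the review comment is trying to prevent.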
@sampan-s-nayak and @edoakes suggested updating this REP to include the full scope of token authentication, including the k8s integration. We'll update the REP once it's ready.
Directionally, this feels reasonable to me.
I skimmed this when you first pinged me, and have hazy memories of this stipulating a lot more about RBAC in Ray? The reason I bring it up is that I think bridging the K8s permission model into Ray is an obvious win and a good first step, but attaching logical RBAC to Ray, whether driven by K8s or not, is a pretty huge undertaking, especially given that workloads can be scheduled on privileged nodes. Does it make sense to break this up into two proposals, one dependent on the other?
Agreed that introducing RBAC in Ray is something worth pursuing. I don't think we need the same set of verbs as Kubernetes (get, list, create, update, patch, delete, etc.), though. I think we can start with a more minimal set of verbs like read and write.
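
As a rough illustration of the "minimal set of verbs" idea, the sketch below collapses Kubernetes verbs into two Ray-side verbs. Everything here is hypothetical: `RayVerb`, the mapping table, and the helper names are not part of the REP or of Ray, and the particular grouping of verbs is just one plausible choice.

```python
from enum import Enum
from typing import Iterable, Set


class RayVerb(Enum):
    """Hypothetical minimal verb set for Ray; not an existing Ray API."""

    READ = "read"
    WRITE = "write"


# Assumed grouping of Kubernetes verbs into the two Ray verbs.
K8S_TO_RAY_VERB = {
    "get": RayVerb.READ,
    "list": RayVerb.READ,
    "watch": RayVerb.READ,
    "create": RayVerb.WRITE,
    "update": RayVerb.WRITE,
    "patch": RayVerb.WRITE,
    "delete": RayVerb.WRITE,
}


def ray_verbs_for(k8s_verbs: Iterable[str]) -> Set[RayVerb]:
    """Translate granted Kubernetes verbs into Ray verbs, ignoring unknown ones."""
    return {K8S_TO_RAY_VERB[v] for v in k8s_verbs if v in K8S_TO_RAY_VERB}


def is_allowed(granted_k8s_verbs: Iterable[str], requested: RayVerb) -> bool:
    """Check whether the granted Kubernetes verbs cover the requested Ray verb.
    In this sketch, write does not imply read."""
    return requested in ray_verbs_for(granted_k8s_verbs)
```

One design question this leaves open is whether write should imply read; the sketch keeps them independent so that the choice stays explicit.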
> Does it make sense to break this up into two proposals, one dependent on the other?
I got feedback from Edward that we should update this enhancement to include the full scope of token authentication. I can take a stab at also including a section on how we would introduce read/write verbs and how they would integrate with Kubernetes RBAC. If it gets too long, we can break it into a separate proposal.
I think I misunderstood your last comment, but after speaking with Edward it seems like we should defer adding additional verbs into Ray for now and revisit it later.
That makes a lot of sense. I think to clarify where I'm coming from:
Making Ray aware of the k8s authentication primitives and wiring them up is, I think, obviously good and shouldn't be particularly controversial. I just saw your followup, but I also probably wrote RBAC where I meant fine-grained authorization, which likely muddled things up a little bit as well. It sounds like we're mostly in agreement.