Flyte Project isolation
Discussed in https://github.com/flyteorg/flyte/discussions/5084
Originally posted by robert-ulbrich-mercedes-benz March 21, 2024
Motivation
Flyte supports multiple tenants through projects and domains. Flyte also comes with authentication that only grants access to users who authenticate successfully. Once authenticated, however, a user has full access to all projects and domains in Flyte, e.g. they can submit a workflow in any Flyte project.
Functional description
To protect data from unauthorized access, it makes sense to isolate Flyte projects from each other, so that a user can only submit workflows in the projects he or she is assigned to. Registering a launch plan, or submitting a workflow or a task, in a project the user is not assigned to will cause an error.
Implementation idea
The projects a user is assigned to will be identified by reading a claim from the identity token that is issued during authentication. Which claim to read is configured in Flyte's configuration file. The Flyte configuration also maintains a mapping between the claim's values and the projects they grant access to.
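As a rough illustration of the idea (not existing Flyte functionality), the check could look something like the sketch below. Everything here is hypothetical: the `IsolationConfig` structure, the claim name `groups`, the group and project names, and the assumption that the token has already been validated and decoded into a plain dictionary of claims.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IsolationConfig:
    """Hypothetical settings: which claim to read and how its values map to projects."""
    claim_name: str
    claim_to_projects: dict[str, frozenset[str]] = field(default_factory=dict)


class ProjectAccessDenied(Exception):
    """Raised when a user acts on a project they are not assigned to."""


def allowed_projects(config: IsolationConfig, claims: dict) -> set[str]:
    """Collect the projects granted by the configured claim.

    `claims` is assumed to be the already-verified, decoded identity token.
    The claim may hold a single string or a list of strings.
    """
    raw = claims.get(config.claim_name, [])
    values = [raw] if isinstance(raw, str) else list(raw)
    projects: set[str] = set()
    for value in values:
        # Claim values without a mapping grant no projects.
        projects |= config.claim_to_projects.get(value, frozenset())
    return projects


def authorize(config: IsolationConfig, claims: dict, project: str) -> None:
    """Raise ProjectAccessDenied unless the claims grant access to `project`."""
    if project not in allowed_projects(config, claims):
        raise ProjectAccessDenied(f"user is not assigned to project {project!r}")


# Example mapping (assumed values, for illustration only).
config = IsolationConfig(
    claim_name="groups",
    claim_to_projects={
        "team-a": frozenset({"project-a"}),
        "team-b": frozenset({"project-b", "shared"}),
    },
)
claims = {"sub": "alice", "groups": ["team-a"]}

authorize(config, claims, "project-a")  # passes silently
try:
    authorize(config, claims, "project-b")
except ProjectAccessDenied as err:
    print(err)
```

A check like `authorize` would have to run before every register, create, or execute request that names a project. Unmapped claim values grant nothing, so a user without a matching claim is denied by default.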
Limits
This approach only aims at isolating Flyte tenants from each other; there is no intention to introduce a full-blown IAM concept including user roles. Any member of a Flyte project therefore still has full access to that project's launch plans, data, etc.
Hey there, @robert-ulbrich-mercedes-benz! Great to see you diving into another interesting challenge with Flyte. Let's see what we can do about that project isolation setup.
Based on the information you've provided and the context from the Flyte repository, it seems like you're looking to implement project-level isolation in Flyte using authentication claims. This is a complex task that involves both authentication (AuthN) and authorization (AuthZ) aspects.
Currently, Flyte supports an OAuth2-based authentication flow, which can be configured using the auth section in the Flyte configuration file. This includes setting up the OAuth2 client parameters such as ClientID, RedirectURI, Scopes, and Audience ^1^. However, this only handles the authentication part, i.e., verifying the identity of the user.
For authorization, i.e., determining what resources a user can access, Flyte does not currently have a built-in mechanism for mapping authentication claims to specific projects. The namespace_mapping section in the configuration file allows for namespace mapping based on project and domain, but it does not directly map authentication claims to projects ^2^.
There has been discussion around implementing Authorization (AuthZ / Role-based access control) in Flyte, with consideration for using Open Policy Agent for this purpose [^3^]. However, as of now, there is no standard for authZ available in open source, and the issue will remain closed until a workable RFC is proposed.
In the meantime, you might need to implement a custom solution for this. One approach could be to integrate your authentication provider with Kubernetes RBAC, using the authentication claims to determine the user's permissions and mapping these to specific projects and domains based on your organization's policies. However, this would likely involve additional logic at the application level or within the authentication/authorization flow.
I hope this information helps. If you have further questions or need more specific guidance, please provide additional details about your setup and requirements.
[^3^]: [Feature] Access control on who can run/view workflows
Sources
- Flyte Project isolation
- [FlyteCTL Feature] Restrict access to users on specific projects
- [Feature] Access control on who can run/view workflows
- flyteadmin/flyteadmin_config.yaml
- charts/flyte-binary/gke-starter.yaml
- flyteadmin/auth/config/third_party_config.go
:wave: @robert-ulbrich-mercedes-benz
Glad to see you picking this up. This has long been a need of ours as well, but we have no capacity in the foreseeable future to tackle the issue. We are handling the limitation by having separate Flyte environments (and clusters) per access boundary, which is not great.
I did want to bring up one facet of enforcing project-level access boundaries, which is how Flyte handles getting data to/from tasks via the data plane. Today, a sidecar on each Flyte task is responsible for communicating with the user and metadata storage containers. In most configurations, the entire task pod has access to the credentials used to access the storage bucket. This needs to be considered for any access boundary to actually be enforced, so that a clever task author doesn't just pick up the credentials for the shared containers and access another project's data.