[Feat] add user annotations to k8s objects
Why are the changes needed?
Presently when submitting workflows via Flyte, it's hard to know who submitted what. This PR addresses one of two concerns:
- I can't filter by users in the Flyte Console
- I can't see a user identifier in k8s
Why do I/we care about 2? Let's say you have two teams (A & B), if team A has access to 10 resources and team B has access to 1000 resources and we know who is in each team (exclusively). One can create further admission controls for any user in team A that requests more resources than are allotted to that team. Furthermore, this can be tracked properly with Kueue.
Today when I submit a Flyte workflow I have no way to actually know who submitted what from a k8s perspective because principal information is not passed to the downstream k8s flyteworkflow object.
What changes were proposed in this pull request?
This PR adds a parameter that enables a new annotation to be injected into each flyteworkflow object that contains the user principal, which usually carries oauth information. Additionally, it allows customization of the annotation prefix.
How was this patch tested?
I added some unit tests, and I also tested in production with the flyte-core chart on EKS and using Keycloak:
$ kubectl describe flyteworkflow -n flyte-workspace ax5hwpvbmwnkhdkmq67m | grep flyte.org
Annotations: flyte.org/user: [email protected]
Labels
Please add one or more of the following labels to categorize your PR:
- added: For new features.
- changed: For changes in existing functionality.
- deprecated: For soon-to-be-removed features.
- removed: For features being removed.
- fixed: For any bug fixed.
- security: In case of vulnerabilities
This is important to improve the readability of release notes.
Setup process
Screenshots
Check all the applicable boxes
- [x] I updated the documentation accordingly.
- [x] All new and existing tests passed.
- [x] All commits are signed-off.
Related PRs
Docs link
Summary by Bito
- This pull request introduces a feature for injecting user annotations into Flyte workflows, enhancing visibility and accountability in multi-team environments, while also addressing potential risks related to data integrity and security in Kubernetes environments.
- It updates the identity annotation prefix to 'flyte.org' and includes customization options for better tracking of user submissions.
- The changes improve error handling in the execution manager, ensuring better resource management and security.
- Configuration updates include new identity annotation fields and modifications to deployment configurations, aiding in resource management.
- Unit tests have been added to ensure the functionality works as intended, validating the new features.
- Overall, this update introduces new user annotation features, customization options, and enhancements in resource management and security.
Thank you for opening this pull request! 🙌
These tips will help get your PR across the finish line:
- Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
- Sign off your commits (Reference: DCO Guide).
Codecov Report
:x: Patch coverage is 87.87879% with 8 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 56.93%. Comparing base (99f2ddd) to head (2860a30).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...kg/runtime/interfaces/application_configuration.go | 0.00% | 8 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #6710 +/- ##
==========================================
+ Coverage 56.91% 56.93% +0.02%
==========================================
Files 929 929
Lines 58077 58139 +62
==========================================
+ Hits 33052 33100 +48
- Misses 21984 21998 +14
Partials 3041 3041
| Flag | Coverage Δ | |
|---|---|---|
| unittests-datacatalog | 53.51% <ø> (ø) |
|
| unittests-flyteadmin | 53.10% <87.87%> (+0.08%) |
:arrow_up: |
| unittests-flytecopilot | 43.06% <ø> (ø) |
|
| unittests-flytectl | 64.02% <ø> (ø) |
|
| unittests-flyteidl | 75.71% <ø> (ø) |
|
| unittests-flyteplugins | 60.13% <ø> (ø) |
|
| unittests-flytepropeller | 53.54% <ø> (ø) |
|
| unittests-flytestdlib | 63.29% <ø> (ø) |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
Hey @davidmirror-ops it looks like there's an issue with pulling helm on a bunch of the tests, is this expected?
I'm working on updating the code coverage to get things green.
@ttitsworth-lila yes, that's still a limitation on the current version of CI. Could you run make helm locally and push the changes?
@davidmirror-ops no problem, done.
We do something similar internally since the principal is not passed from the control plane to the data plane, and thus can't be rendered into the k8s manifests in propeller. I will review this more deeply later.
We do something similar internally since the principal is not passed from the control plane to the data plane, and thus can't be rendered into the k8s manifests in propeller. I will review this more deeply later.
@Sovietaced Thanks for the comment. I'd love to understand more about what you mean. Presently I'm testing this in a multi-cluster setup where my above method is able to pass that information. What IDP provider do you use internally?
@Sovietaced Thanks for the comment. I'd love to understand more about what you mean. Presently I'm testing this in a multi-cluster setup where my above method is able to pass that information. What IDP provider do you use internally?
Not sure what you mean with regards to IDP. What I'm trying to say is that flyte admin (the control plane) has the identity information when the user calls the API to create an execution. None of that information is passed down into the workflow resource and as a result is unusable by the flyte propeller (the data plane). So like you've done here we inject some of that information into the workflow annotations which isn't super elegant, but it works.
Arguably the workflow spec should embed some information about the user identity in a more structured way.
So like you've done here we inject some of that information into the workflow annotations which isn't super elegant, but it works. Arguably the workflow spec should embed some information about the user identity in a more structured way.
Like your suggestion @Sovietaced , but would require a lot more changes across the stack to include in the spec but the current PR can be considered incremental and adds a benefit for having identity info in the dataplane.
Thanks for the review @pmahindrakar-oss, I just pushed a change to improve the coverage. I'm having some issues running helm make:
Save error occurred: could not find : chart docker-registry not found in https://helm.twun.io/: looks like "https://helm.twun.io/" is not a valid chart repository or cannot be reached: Get "https://helm.twun.io/index.yaml": dial tcp: lookup helm.twun.io on 10.255.255.254:53: no such host
Hope this is acceptable, and would appreciate some help with landing this PR :)
Thanks for the review @pmahindrakar-oss, I just pushed a change to improve the coverage. I'm having some issues running
helm make:Save error occurred: could not find : chart docker-registry not found in https://helm.twun.io/: looks like "https://helm.twun.io/" is not a valid chart repository or cannot be reached: Get "https://helm.twun.io/index.yaml": dial tcp: lookup helm.twun.io on 10.255.255.254:53: no such hostHope this is acceptable, and would appreciate some help with landing this PR :)
The DNS issue looks like possible company VPN / firewall not resolving approved domains
The DNS issue looks like possible company VPN / firewall not resolving approved domains
fixed it here https://github.com/flyteorg/flyte/pull/6726
Got some conflicts
Bito Review Skipped - No Changes Detected
Bito didn't review this pull request because we did not detect any changes in the pull request to review.
@machichima Thanks for the review! I think I've addressed all of your comments.
cc @EngHabu @fg91 could you please also take a look at this? Thanks!
There already is a feature to inject the user identity as a label :)
There already is a feature to inject the user identity as a label :)
@fg91 can you share the docs for that? Or just point me to where that is?
There already is a feature to inject the user identity as a label :)
@fg91 can you share the docs for that? Or just point me to where that is?
https://github.com/flyteorg/flyte/pull/4637
This applies the label on the pod object though and I just realized you would like to have it on the flyteworkflow CR. Maybe we could reuse the same label though instead of adding new annotation with a different name?
There already is a feature to inject the user identity as a label :)
@fg91 can you share the docs for that? Or just point me to where that is?
#4637
This applies the label on the pod object though and I just realized you would like to have it on the flyteworkflow CR. Maybe we could reuse the same label though instead of adding new annotation with a different name?
Happy to consolidate to prevent duplication here, I'm just trying to understand how your PR works because I don't currently see anything related to labels.execution_identity, does this label only appear when you set the execution identity in the podtemplate?
There already is a feature to inject the user identity as a label :)
@fg91 can you share the docs for that? Or just point me to where that is?
#4637 This applies the label on the pod object though and I just realized you would like to have it on the flyteworkflow CR. Maybe we could reuse the same label though instead of adding new annotation with a different name?
Happy to consolidate to prevent duplication here, I'm just trying to understand how your PR works because I don't currently see anything related to
labels.execution_identity, does this label only appear when you set the execution identity in the podtemplate?
No, you don't need to set anything in the pod template 🤔 If I'm not completely mistaken, the only requirement is that flyteadmin needs to have auth enabled but I assume you have that?
There already is a feature to inject the user identity as a label :)
@fg91 can you share the docs for that? Or just point me to where that is?
#4637 This applies the label on the pod object though and I just realized you would like to have it on the flyteworkflow CR. Maybe we could reuse the same label though instead of adding new annotation with a different name?
Happy to consolidate to prevent duplication here, I'm just trying to understand how your PR works because I don't currently see anything related to
labels.execution_identity, does this label only appear when you set the execution identity in the podtemplate?No, you don't need to set anything in the pod template 🤔 If I'm not completely mistaken, the only requirement is that flyteadmin needs to have auth enabled but I assume you have that?
I went and verified that exact key in our values file, and yes we have auth enabled. Obviously we need the auth-flow to work for my changes to function as well. Chart flyte-core-v1.15.3. I'm wondering if we could get a third party like @machichima or @pmahindrakar-oss to help verify if this feature @fg91 introduced is functioning? I can only assume it's working for your team @fg91!
There already is a feature to inject the user identity as a label :)
@fg91 can you share the docs for that? Or just point me to where that is?
#4637
This applies the label on the pod object though and I just realized you would like to have it on the flyteworkflow CR. Maybe we could reuse the same label though instead of adding new annotation with a different name?
On the topic of consolidation, it's not obvious to me the best way to consolidate since the scope of this PR grew to include more than just the identity email. If we do anything, we could just keep both for backwards compatibility and let the community decide how to proceed. Thoughts?
I think this is mostly fine btw, but agreed with @Sovietaced - I feel like there should be a more cohesive story around how identity is injected. And also how it can be used by the various plugins after it's been injected into the wf CRD. Are email/sub the only things that will ever need to be injected? Will anyone ever want to inject an entire token (for further downstream auth by plugins themselves)?
I feel like there should be a more cohesive story around how identity is injected.
Not all pieces of an identity should be injected IMO. The user deploying Flyte should have to explicitly choose what IDP labels they will expose to their propeller env(s).
Are email/sub the only things that will ever need to be injected?
From my perspective, yes. This is now giving contributors a pathway to add new labels for their IDP provider if they have their own use-case.
Will anyone ever want to inject an entire token (for further downstream auth by plugins themselves)?
I had not considered this as a valid method to auth with plugins, it seems like there are security implications with that approach. I am also unaware as to what plugins would benefit from this approach.
@davidmirror-ops It looks like some of the tests are still failing with errors I'm unsure about, how do we want to proceed with this PR?
Bito Review Skipped - No Changes Detected
Bito didn't review this pull request because we did not detect any changes in the pull request to review.
@davidmirror-ops It looks like some of the tests are still failing with errors I'm unsure about, how do we want to proceed with this PR?
I think you need to make helm again and push. Also it seems like there's some lint error, you can run make lint locally to fix.
Seems like tests are passed.
Feel free to ping me after you finished, and I can re-trigger the workflow for you.
And I'll look into the PR that @fg91 mentioned, and see if we need to modify this PR to reuse the label. Thank you!
I've tested out the PR that @fg91 mentioned (https://github.com/flyteorg/flyte/pull/4637). I think it's a bit different from what we want to do here.
Update:
I can see the execution-identity label on my side, which only shows the subject. I think for showing different info other than subject (e.g. email), the current approach looks better for me.
What #4637 did is to add the Secret set in the task to annotations. I tested with following flytekit script:
from flytekit import Secret, task, workflow
@task(
secret_requests=[
Secret(group="testing-secret-1", key="username", mount_requirement=Secret.MountType.ENV_VAR),
Secret(group="testing-secret-2", key="api-token"),
]
)
def my_task_using_secrets():
pass
@workflow
def wf():
my_task_using_secrets()
And the pod created will have the label and annotation. The flyte.secrets/s0 value is the encoded string for the Secret setting we provided in the task
❯ k describe pod a7qswm5zm79mp6t25jz2-n0-0 -n flytesnacks-development
Name: a7qswm5zm79mp6t25jz2-n0-0
Namespace: flytesnacks-development
Priority: 0
Service Account: default
Node: cdb78fa419be/192.168.215.3
Start Time: Mon, 08 Dec 2025 15:54:30 +0800
Labels: domain=development
execution-id=a7qswm5zm79mp6t25jz2
inject-flyte-secrets=true # <- here
interruptible=false
node-id=n0
project=flytesnacks
shard-key=27
task-name=test-secret-my-task-using-secrets
workflow-name=test-secret-wf
Annotations: cluster-autoscaler.kubernetes.io/safe-to-evict: false
# and following two flyte.secrets annotations
flyte.secrets/s0: m4zg54lqhiqce4dfon1gs2thfvzwky2smv1c1mjcbjvwk5j1earhk32fojxgc2lfeifg122vnz1f53tfof1ws3tfnvsw34b1ebcu3vs6kzavecq
flyte.secrets/s1: m4zg54lqhiqce4dfon1gs2thfvzwky2smv1c1mrcbjvwk5j1eargc3djfv1g512fnyrau
primary_container_name: a7qswm5zm79mp6t25jz2-n0-0
cc @fg91 to confirm if it's correct? In this case I think it's ok to keep this PR as is. WDYT?
@machichima I ran make helm just fine, but make lint didn't work. I did a bit of looking in the docs and files to see if I'm missing anything, but I just did a fresh clone to confirm that for whatever reason make lint fails on the current default branch:
$ make lint
Makefile:130: warning: overriding recipe for target 'go-tidy'
boilerplate/flyte/golang_test_targets/Makefile:59: warning: ignoring old recipe for target 'go-tidy'
make[1]: Entering directory '/home/user/flyte'
make[1]: *** /flytestdlib: No such file or directory. Stop.
make[1]: Leaving directory '/home/user/flyte'
make: *** [boilerplate/flyte/golang_test_targets/Makefile:10: download_tooling] Error 2
Editing the path unraveled the error a little bit more, but I'm just curious if I missed an obvious step. So I instead just ran golangci-lint directly and found some linting errors in the unrelated cmd directory.
INFO [runner] linters took 29.799047289s with stages: goanalysis_metalinter: 29.796747133s
cmd/single/console.go:29:6: type consoleFS is unused (unused)
type consoleFS struct {
^
cmd/single/console.go:35:20: func consoleFS.Open is unused (unused)
func (f consoleFS) Open(name string) (http.File, error) {
^
cmd/single/without_console_dist.go:11:5: var console is unused (unused)
var console fs.FS
^
3 issues:
* unused: 3