flyte
[BUG] Copilot does not have access to IAM role auth
Describe the bug
Container Tasks only have access to the default bucket configured in propeller, whereas Python Tasks have access to anything permitted by the IAM role attached to the pod. This prevents copilot from downloading objects from any non-default bucket. Additionally, the copilot init container exits successfully despite logging a permission denied error.
Expected behavior
Copilot should have access to any bucket allowed by the IAM role. Objects in those buckets should be downloaded and made available to the ContainerTask. The copilot init container should fail when access is denied.
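To make the expected failure behavior concrete, here is a minimal sketch of a downloader that turns a permission error into a non-zero exit code instead of only logging it. It is not copilot's actual code (copilot is a separate component and its internals are not shown in this issue); `AccessDenied`, `download_inputs`, `run`, and the `fetch` callback are all hypothetical names used for illustration.

```python
import sys


class AccessDenied(Exception):
    """Hypothetical stand-in for a storage permission error."""


def download_inputs(uris, fetch):
    """Fetch every input URI; any exception from fetch propagates to the caller."""
    for uri in uris:
        fetch(uri)


def run(uris, fetch):
    """Return a process exit code: 0 if all downloads succeed, 1 on access denied."""
    try:
        download_inputs(uris, fetch)
    except AccessDenied as err:
        # Log the error AND report failure, so the init container does not
        # appear to have succeeded.
        print(f"download failed: {err}", file=sys.stderr)
        return 1
    return 0
```

An entry point would then end with `sys.exit(run(uris, fetch))`, so that Kubernetes sees the init container as failed and the main container never starts with missing inputs.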
Additional context to reproduce
Create and run a raw container task using any S3 link apart from the default bucket.
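When picking an input for the reproduction, it can help to confirm that the link really points outside the default bucket. The helper below is a small sketch for that check; `is_outside_default_bucket` is a hypothetical name, and the default bucket name must be taken from your own propeller configuration.

```python
from urllib.parse import urlparse


def is_outside_default_bucket(uri: str, default_bucket: str) -> bool:
    """Return True if an s3:// URI points at a bucket other than default_bucket."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {uri}")
    # For s3://bucket/key, the bucket name is the network-location component.
    return parsed.netloc != default_bucket
```

Any URI for which this returns `True` should, per the report above, trigger the failed download when passed as an input to a raw container task.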
Screenshots
No response
Are you sure this issue hasn't been raised already?
- [X] Yes
Have you read the Code of Conduct?
- [X] Yes
Hello @pryce-turner! 👋 I'm Dosu, a friendly bot designed to assist you with resolving bugs, answering questions, and guiding you to become a contributor. While we wait for a human maintainer, feel free to ask me anything related to the repository. I'm here to make your experience smoother! 😊
Thank you for reporting this issue. It seems like there's a discrepancy between the IAM role access for Container Tasks and Python Tasks in Flyte.
As per the current implementation, the IAM role authentication for Container Tasks is handled through the CoreIdentity struct in the flyteadmin package. This struct encapsulates the various security identities a task can run as. For IAM role authentication, the IamRole field is used. This field references the fully qualified name of the Identity & Access Management role to impersonate. It's up to the underlying plugin to pick the right identity for the execution environment. You can find more details in the flyteidl/gen/pb-go/flyteidl/service/flyteadmin/model_core_identity.go file.
However, it seems like the copilot is not correctly picking up the IAM role for non-default buckets. This could be due to a bug in the copilot or the underlying plugin.
As for the copilot init container exiting successfully despite logging a permission denied error, this could be due to the error handling in the init container. It might be ignoring the permission denied error and exiting with a success status. This is definitely something that needs to be addressed.
Unfortunately, as a Dosu, I don't have the ability to create pull requests or issues. I would recommend you to create a new issue in the Flyte repository detailing this problem. Please include as much information as possible, including the steps to reproduce the issue, any error messages you're seeing, and any relevant configuration details. This will help the Flyte maintainers to better understand and fix the issue.
Sources
- flyteadmin/auth/authzserver/provider.go
- flyteidl/gen/pb-go/flyteidl/service/flyteadmin/model_core_identity.go
Adding comment from Jeev: the issue is that copilot's mechanism does not allow fetching objects outside of propeller's default bucket (which is passed through to copilot). It DOES have access to the IAM role auth.
Hey there, I wanted to check on the status of this issue. We are starting to increase the workloads going through this container task and expect production workloads on this system soon. We are hitting the same issue in another workflow as well, where we would like to use a shell task. While we have a viable workaround, the pain grows with the workload: the workaround requires sending all the data through the network two more times and keeping one more full copy of the data in the system. Any update on the status and timeline of this issue would be appreciated. Thanks a lot.