containerized-data-importer

CDI importer requires static credentials for S3 and GCS

Open emmanuel opened this issue 1 year ago • 17 comments

Is your feature request related to a problem? Please describe: Workloads should not have access to long-lived credentials. Cloud environments in particular provide mechanisms for distributing short-lived credentials, and the provider SDKs support fetching and using them. Because the importer requires static credentials, CDI prevents following this security best practice when retrieving images from S3 and GCS sources.

Describe the solution you'd like: CDI's operator should provide a method for specifying the image importer pod's ServiceAccount. This would enable access to existing credential distribution mechanisms: AWS's IRSA credentials as well as GKE Workload Identity. Additionally, the importer will need to be updated to rely on "ambient" credentials (retrieved by the cloud provider SDK), instead of the importer's current hard-coded reliance on static credentials. This could be accomplished by extending the S3DataSource and GCSDataSource structs with a serviceAccountName member, updating the relevant function signatures, and then branching when creating the client.
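To make the proposed branching concrete, here is a rough sketch in Go using only the standard library. All names in it (`S3Source`, `SecretRef`, `ServiceAccountName`, `credentialModeFor`) are hypothetical and are not CDI's actual structs or function signatures. Rejecting a source that sets both a secret and a service account is also just one possible design choice, not something CDI does today.

```go
package main

import (
	"errors"
	"fmt"
)

// CredentialMode describes how the importer would authenticate to the object store.
type CredentialMode int

const (
	// StaticCredentials: access key and secret key read from a Kubernetes Secret
	// (the behavior described in this issue).
	StaticCredentials CredentialMode = iota
	// AmbientCredentials: credentials resolved by the cloud provider SDK,
	// e.g. via IRSA or GKE Workload Identity.
	AmbientCredentials
)

func (m CredentialMode) String() string {
	if m == AmbientCredentials {
		return "ambient"
	}
	return "static"
}

// S3Source is a hypothetical stand-in for the importer's S3 source settings.
// ServiceAccountName is the proposed new member; it does not exist in CDI today.
type S3Source struct {
	URL                string
	SecretRef          string // name of the Secret holding static keys
	ServiceAccountName string // proposed: ServiceAccount for the importer pod
}

// credentialModeFor picks the branch to take when creating the client.
// Assumption: setting both fields is treated as a configuration error.
func credentialModeFor(src S3Source) (CredentialMode, error) {
	if src.SecretRef != "" && src.ServiceAccountName != "" {
		return StaticCredentials, errors.New("specify either a secret or a service account, not both")
	}
	if src.ServiceAccountName != "" {
		return AmbientCredentials, nil
	}
	// No service account requested: keep the existing static-credential path.
	return StaticCredentials, nil
}

func main() {
	src := S3Source{URL: "s3://example-bucket/disk.img", ServiceAccountName: "importer-sa"}
	mode, err := credentialModeFor(src)
	if err != nil {
		fmt.Println("invalid source:", err)
		return
	}
	fmt.Println("credential mode:", mode)
}
```

In the ambient branch the importer would build its S3 or GCS client without passing explicit keys, leaving credential resolution to the SDK's default lookup. The static branch would stay as it is today, so existing users are unaffected.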

Describe alternatives you've considered: I have currently created a dedicated IAM user, issued an access key ID and secret key, and stored those hard-coded creds in a Kubernetes secret referenced by the importer. It "works", at the cost of violating existing security approaches for workload authentication.

emmanuel, Dec 05 '23 22:12