pipelines
[frontend] Artifact API Does not support S3 Regions
[Screenshot 2022-08-08 at 18 16 17]
[Screenshot 2022-08-08 at 18 16 02]
The artifact fetching API does not seem to support a region for S3 resources. If we look here, https://github.com/kubeflow/pipelines/blob/e8abec24fed4c4f8be6f527207b1cec9811ce3e7/frontend/server/minio-helper.ts#L50, the region parameter is not passed to the client, and it is not part of the artifact API either. However, the MinIO client API allows you to specify a region when creating a client: https://docs.min.io/docs/javascript-client-api-reference.html. I think the API should fetch the default region from a MINIO_REGION environment variable, which should be passed as part of the ml-pipeline-ui-artifact k8s deployment env map.
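A minimal sketch of that suggestion, assuming the client options are assembled as a plain object before the MinIO client is created. The interface and function names are hypothetical and are not taken from minio-helper.ts; only the `region` option and the proposed MINIO_REGION variable come from the discussion above.

```typescript
// Hypothetical subset of the options passed to the MinIO client constructor.
interface S3ClientOptions {
  endPoint: string;
  accessKey: string;
  secretKey: string;
  useSSL: boolean;
  region?: string;
}

// Environment passed in explicitly so the sketch does not depend on Node typings.
type Env = Record<string, string | undefined>;

// Adds a default region from MINIO_REGION unless the caller already set one.
function withDefaultRegion(options: S3ClientOptions, env: Env): S3ClientOptions {
  if (options.region) {
    return options;
  }
  const region = env["MINIO_REGION"];
  return region ? { ...options, region } : options;
}
```

In the real server the second argument would presumably be `process.env`; leaving `region` unset when the variable is missing keeps the client's own default behavior.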
Environment
- How did you deploy Kubeflow Pipelines (KFP)? kubeflow-manifests
- KFP version: 1.5.1
Steps to reproduce
- Launch any pipeline
- Go to Run UI
- Try to download any output
- Observe the error: Failed to get object in bucket [bucket] at path [path]: S3Error: The authorization header is malformed; the region 'us-east-1' is wrong; expecting '[actual bucket region]'
Expected result
Materials and Reference
My bad, I've found that it actually uses the AWS_REGION variable here: https://github.com/kubeflow/pipelines/blob/e8abec24fed4c4f8be6f527207b1cec9811ce3e7/frontend/server/configs.ts#L129. I will add that to the deployment to test whether it works.
It works as expected if I specify AWS_REGION, AWS_SECRET_ACCESS_KEY, and AWS_ACCESS_KEY_ID.
Actually, I think there is still an issue when we use a multi-profile setup. The problem lies in the ml-pipeline-ui-artifact deployment, which is automatically created for each profile.
Here, https://github.com/kubeflow/pipelines/blob/a0a8f1da8cb7ca53cde7717aa78e666b634fec75/manifests/kustomize/base/installs/multi-user/pipelines-profile-controller/sync.py#L304, we sync only the MINIO_* variables. However, when using S3, this code suggests that we should provide AWS_REGION, AWS_SECRET_ACCESS_KEY, and AWS_ACCESS_KEY_ID: https://github.com/kubeflow/pipelines/blob/e8abec24fed4c4f8be6f527207b1cec9811ce3e7/frontend/server/configs.ts#L129
That means that we should add those variables to pipelines-profile-controller. When I add them manually to the ml-pipeline-ui-artifact deployment, the Pipeline UI fetches all S3 artifacts as expected.
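For illustration, the extra entries could look roughly like the following, written here as a TypeScript helper that builds the container `env` list. This is a sketch, not the actual change to sync.py: the helper name, the secret name parameter, and the `accesskey` / `secretkey` key names are assumptions, while the entry shape follows the standard Kubernetes container env schema.

```typescript
// Shape of a Kubernetes container env entry (plain value or secret reference).
interface EnvVar {
  name: string;
  value?: string;
  valueFrom?: { secretKeyRef: { name: string; key: string } };
}

// Hypothetical helper: builds the AWS_* entries for the ml-pipeline-ui-artifact container.
// `secretName` and the key names "accesskey"/"secretkey" are assumptions for this sketch.
function awsEnvForArtifactUi(region: string, secretName: string): EnvVar[] {
  return [
    { name: "AWS_REGION", value: region },
    {
      name: "AWS_ACCESS_KEY_ID",
      valueFrom: { secretKeyRef: { name: secretName, key: "accesskey" } },
    },
    {
      name: "AWS_SECRET_ACCESS_KEY",
      valueFrom: { secretKeyRef: { name: secretName, key: "secretkey" } },
    },
  ];
}
```

Referencing the credentials through a secret rather than as literal values keeps them out of the deployment spec; the region is not sensitive, so a plain value is enough.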
Another possible solution would be to add a MINIO_REGION variable mapping here instead of modifying pipelines-profile-controller: https://github.com/kubeflow/pipelines/blob/e8abec24fed4c4f8be6f527207b1cec9811ce3e7/frontend/server/configs.ts#L139. However, I am not entirely sure that it will work.
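One way such a mapping might look, as an untested sketch: resolve the region from AWS_REGION first, then from the proposed MINIO_REGION, then from a default. The function name is hypothetical, and the `us-east-1` default is assumed from the region reported in the error message above, not confirmed against configs.ts.

```typescript
// Environment passed in explicitly so the sketch does not depend on Node typings.
type EnvMap = Record<string, string | undefined>;

// Hypothetical resolver: AWS_REGION wins, MINIO_REGION is the fallback,
// and the final default is an assumption based on the reported error.
function resolveS3Region(env: EnvMap, fallback: string = "us-east-1"): string {
  return env["AWS_REGION"] || env["MINIO_REGION"] || fallback;
}
```

With this ordering, deployments that already set AWS_REGION behave as before, and profiles that only receive MINIO_* variables could still pick up a region.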
Hi @kdubovikov @chensun,
Also running into this problem after switching Artifact Store persistence to S3.
Wondering if you've seen the latest MinioJS v7.0.27 with Assume Web Identity Role support https://github.com/minio/minio-js/pull/960 ?
This could be part of the fix, without needing any workaround if I'm not mistaken.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.