
[frontend] Artifact API Does not support S3 Regions

Open kdubovikov opened this issue 2 years ago • 4 comments

[Screenshots: 2022-08-08 at 18:16:17 and 2022-08-08 at 18:16:02]

The artifact-fetching API does not seem to support a region for S3 resources. If we look at https://github.com/kubeflow/pipelines/blob/e8abec24fed4c4f8be6f527207b1cec9811ce3e7/frontend/server/minio-helper.ts#L50, the region parameter is not passed, and it is not part of the artifact API either. However, the MinIO client API allows you to specify a region when creating a client: https://docs.min.io/docs/javascript-client-api-reference.html. I think the API should fetch the default region from a MINIO_REGION environment variable, which should be passed as part of the ml-pipeline-ui-artifact k8s deployment env map.

Environment

Steps to reproduce

  1. Launch any pipeline
  2. Go to Run UI
  3. Try to download any output
  4. Observe the error: Failed to get object in bucket [bucket] at path [path]: S3Error: The authorization header is malformed; the region 'us-east-1' is wrong; expecting '[actual bucket region]'
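For anyone triaging the same failure, a minimal sketch of pulling the two regions out of the error text in step 4. The pattern is based only on the message quoted above, and `parse_region_mismatch` is a hypothetical helper, not part of Kubeflow Pipelines or MinIO; other S3-compatible backends may word the message differently.

```python
import re
from typing import Optional, Tuple

# Pattern derived from the error text quoted in this issue (an assumption:
# other backends or SDK versions may phrase the message differently).
_REGION_MISMATCH = re.compile(
    r"the region '(?P<used>[^']+)' is wrong; expecting '(?P<expected>[^']+)'"
)


def parse_region_mismatch(message: str) -> Optional[Tuple[str, str]]:
    """Return (region_used, region_expected), or None if the message does not match."""
    match = _REGION_MISMATCH.search(message)
    if match is None:
        return None
    return match.group("used"), match.group("expected")
```

The second element of the returned tuple is the region the bucket actually lives in, which is the value the artifact server needs to be configured with.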

Expected result

Materials and Reference

kdubovikov avatar Aug 08 '22 15:08 kdubovikov

My bad, I've found that it actually uses the AWS_REGION variable here: https://github.com/kubeflow/pipelines/blob/e8abec24fed4c4f8be6f527207b1cec9811ce3e7/frontend/server/configs.ts#L129. I will add that to the deployment to test whether it works.

kdubovikov avatar Aug 08 '22 15:08 kdubovikov

It works as expected if I specify AWS_REGION, AWS_SECRET_ACCESS_KEY, and AWS_ACCESS_KEY_ID

kdubovikov avatar Aug 08 '22 15:08 kdubovikov

Actually, I think there is still an issue when we use a multi-profile setup. The problem lies in the ml-pipeline-ui-artifact deployment, which is automatically created for each profile.

Here, https://github.com/kubeflow/pipelines/blob/a0a8f1da8cb7ca53cde7717aa78e666b634fec75/manifests/kustomize/base/installs/multi-user/pipelines-profile-controller/sync.py#L304, we sync only the MINIO_* variables. However, when using S3, the following code suggests that we should provide AWS_REGION, AWS_SECRET_ACCESS_KEY, and AWS_ACCESS_KEY_ID.

https://github.com/kubeflow/pipelines/blob/e8abec24fed4c4f8be6f527207b1cec9811ce3e7/frontend/server/configs.ts#L129

That means we should add those variables to pipelines-profile-controller. When I add them manually to the ml-pipeline-ui-artifact deployment, the Pipeline UI fetches all S3 artifacts as expected.
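A rough sketch of what the extra environment entries might look like if they were generated in Python alongside the existing MINIO_* ones. This is not the actual sync.py code: the Secret name `mlpipeline-minio-artifact`, its keys `accesskey` and `secretkey`, and the helpers `secret_env` and `artifact_ui_env` are assumptions for illustration and should be adjusted to the cluster.

```python
from typing import Dict, List

# Assumption: the S3 credentials are stored in a Secret with this name and
# these keys in each profile namespace. Adjust to match your installation.
SECRET_NAME = "mlpipeline-minio-artifact"


def secret_env(name: str, key: str, secret: str = SECRET_NAME) -> Dict:
    """Build a container env entry that reads its value from a Secret key."""
    return {"name": name, "valueFrom": {"secretKeyRef": {"name": secret, "key": key}}}


def artifact_ui_env(region: str) -> List[Dict]:
    """Env list for the ml-pipeline-ui-artifact container (hypothetical helper)."""
    return [
        # Variables that are already synced today.
        secret_env("MINIO_ACCESS_KEY", "accesskey"),
        secret_env("MINIO_SECRET_KEY", "secretkey"),
        # Additional variables needed when the artifact store is S3.
        secret_env("AWS_ACCESS_KEY_ID", "accesskey"),
        secret_env("AWS_SECRET_ACCESS_KEY", "secretkey"),
        {"name": "AWS_REGION", "value": region},
    ]
```

Reading the AWS credentials from the same Secret keeps a single source of truth per profile; only the region is passed as a plain value.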

Another possible solution would be to add a MINIO_REGION variable mapping here, instead of modifying pipelines-profile-controller: https://github.com/kubeflow/pipelines/blob/e8abec24fed4c4f8be6f527207b1cec9811ce3e7/frontend/server/configs.ts#L139. However, I am not entirely sure that it will work.
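To make the idea concrete, here is a language-neutral sketch (written in Python, although configs.ts is TypeScript) of the lookup order such a mapping could use. `resolve_region` is hypothetical, MINIO_REGION is the variable proposed in this issue rather than one the server is known to read, and the `us-east-1` default is assumed from the error message above.

```python
import os
from typing import Mapping

# Assumption: when no region is configured, requests are signed for
# us-east-1, as the error message in this issue suggests.
DEFAULT_REGION = "us-east-1"


def resolve_region(env: Mapping[str, str] = os.environ) -> str:
    """Pick the region from AWS_REGION, then MINIO_REGION, then the default."""
    for name in ("AWS_REGION", "MINIO_REGION"):
        value = env.get(name, "").strip()
        if value:  # skip unset or blank variables
            return value
    return DEFAULT_REGION
```

With this order, an explicit AWS_REGION still wins, so deployments that already set it would behave as before.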

kdubovikov avatar Aug 10 '22 16:08 kdubovikov

Hi @kdubovikov @chensun,

Also running into this problem after switching Artifact Store persistence to S3.

Wondering if you've seen the latest MinioJS v7.0.27 with Assume Web Identity Role support https://github.com/minio/minio-js/pull/960 ?

This could be part of the fix, without needing any workaround if I'm not mistaken.

rawc0der avatar Sep 01 '22 13:09 rawc0der

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Mar 06 '24 07:03 github-actions[bot]