Kubeflow unable to correctly show URL when deployed in public cloud
Bug Description
We deployed Kubeflow with S3 as the storage layer.
MinIO is configured as below:

```yaml
spec:
  containers:
  - args:
    - gateway
    - s3
    - http://s3.eu-west-1.amazonaws.com
    - --console-address
    - :9001
```
We are unable to reach the artifact logs unless we manually edit the URL from `minio://` to `s3://`.
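The manual workaround described above (rewriting the URI scheme before handing the artifact URI to an S3 client) can be sketched in a few lines. This is a hypothetical client-side helper, not part of Kubeflow:

```python
# Client-side sketch: rewrite a minio:// artifact URI to s3:// so that
# standard S3 tooling can resolve it. Hypothetical helper, not Kubeflow code.
from urllib.parse import urlsplit, urlunsplit

def to_s3_uri(uri: str) -> str:
    """Return the URI with the minio:// scheme replaced by s3://."""
    parts = urlsplit(uri)
    if parts.scheme == "minio":
        parts = parts._replace(scheme="s3")
    return urlunsplit(parts)

print(to_s3_uri("minio://mlpipeline/v2/artifacts/model"))
# s3://mlpipeline/v2/artifacts/model
```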
To Reproduce
- Deploy the Kubeflow bundle with MinIO configured in gateway mode in AWS.
- Run any pipeline; the artifact URI has `minio://` instead of `s3://`.
Environment
Kubeflow bundle running in AWS cloud.
Relevant Log Output
N/A
Additional Context
No response
Thank you for reporting your feedback to us!
The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-8165.
This message was autogenerated
Hello @honghan-wong! I tried to reproduce your issue:
Reproduction
- Deployed CKF `1.10/stable` using the Terraform module
- Installed MicroCeph with:

```shell
sudo snap install microceph
sudo microceph cluster bootstrap
sudo ceph -s
sudo microceph disk add loop,4G,3
sudo ceph -s
sudo microceph enable rgw --port 7480
USER=manos
sudo radosgw-admin user create --uid=$USER --display-name=$USER
sudo radosgw-admin key create --uid=$USER --key-type=s3 --access-key=foo --secret-key=bar
sudo radosgw-admin user list
```
- Created a bucket named `mlpipeline`, in accordance with the default value for `object-store-bucket-name` in `kfp-api`:

```shell
sudo apt-get install s3cmd
cat > ~/.s3cfg <<EOF
[default]
access_key = foo
secret_key = bar
host_base = dev:7480
host_bucket = dev:7480/%(bucket)
check_ssl_certificate = False
check_ssl_hostname = False
use_https = False
EOF
s3cmd mb -P s3://mlpipeline
```
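As a side note, the generated `~/.s3cfg` can be sanity-checked programmatically; this is a minimal sketch (assuming the throwaway `foo`/`bar` credentials and port `7480` from the steps above) that catches a port mismatch between `host_base` and `host_bucket`:

```python
# Sanity-check sketch for the ~/.s3cfg written above. The values mirror the
# throwaway credentials from this reproduction; names here are illustrative.
import configparser
import io

S3CFG = """\
[default]
access_key = foo
secret_key = bar
host_base = dev:7480
host_bucket = dev:7480/%(bucket)
use_https = False
"""

# RawConfigParser: "%(bucket)" is an s3cmd template, not configparser interpolation
cfg = configparser.RawConfigParser()
cfg.read_file(io.StringIO(S3CFG))

base_port = cfg.get("default", "host_base").split(":")[-1]
bucket_port = cfg.get("default", "host_bucket").split(":")[-1].split("/")[0]
assert base_port == bucket_port == "7480", "host_base/host_bucket port mismatch"
print("s3cfg ports consistent")
```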
- Configured `minio` to use the above values:

```shell
juju config minio secret-key=$SECRET_KEY
juju config minio access-key=$ACCESS_KEY
juju config minio gateway-storage-service=s3
juju config minio storage-service-endpoint=http://$(hostname):7480
juju config minio mode=gateway
```
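For reference, the gateway settings above can be kept in one place and turned into the corresponding `juju config` commands; a small hypothetical helper (the `dev` hostname and `foo`/`bar` credentials are assumptions from this reproduction, not defaults):

```python
# Hypothetical helper: keep the MinIO gateway settings in one dict and
# derive the `juju config minio` commands used above from it.
def juju_minio_commands(settings: dict) -> list:
    """Render each key/value pair as a `juju config minio` invocation."""
    return [f"juju config minio {key}={value}" for key, value in settings.items()]

settings = {
    "secret-key": "bar",                        # assumption: radosgw key from above
    "access-key": "foo",                        # assumption: radosgw key from above
    "gateway-storage-service": "s3",
    "storage-service-endpoint": "http://dev:7480",  # assumption: host is "dev"
    "mode": "gateway",
}

for cmd in juju_minio_commands(settings):
    print(cmd)
```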
- Entered the CKF dashboard and ran the `[Tutorial] Data passing in python components` pipeline. The pipeline ran successfully.
- The output looks like this:
- Clicking the link successfully downloads the artifact. Clicking the `view all` button also shows the artifact.
- The artifact is reachable with `s3cmd ls s3://mlpipeline/v2/artifacts/tutorial-data-passing/dbbf67a6-8290-4025-958e-e792840380b8/train/826a1884-7e30-454d-9951-3c6bb0247b5f/model`
- It is, however, not reachable with `s3cmd ls minio://mlpipeline/v2/artifacts/tutorial-data-passing/dbbf67a6-8290-4025-958e-e792840380b8/train/826a1884-7e30-454d-9951-3c6bb0247b5f/model`
Exploration
- To see if this is an issue in our rock, I retried the same process using the upstream image `ghcr.io/kubeflow/kfp-frontend:2.5.0`. The link text is again the same.
- I edited the `ARGO_ARCHIVE_ARTIFACTORY` value in `pebble_components.py` to `hellothere`, packed the charm, and refreshed it. The value didn't change.
Questions
- Are you able to download the artifacts by clicking on the link? Or is your issue regarding the `minio://` URI that isn't accessible via the `s3cmd` CLI?
Additional Information
I received a description from someone experiencing this issue in their live environment.
They observed that all objects coming from minio:// were misbehaving. When they click on any component in a run, then click any links in the Output Artifacts section, they get a "Failed to get object in bucket: S3Error: The Access Key Id you provided does not exist in our records" message.
This is apparently happening to all runs in this particular environment.
Thanks for the additional context @LCVcode!
I think the underlying issue is this one: https://github.com/canonical/kfp-operators/issues/449. When MinIO was configured in gateway mode and the credentials were updated, they were never propagated to the user namespaces (in the K8s Secrets).
To confirm this, @LCVcode could you help us with the following:
- Does the K8s Secret `mlpipeline-minio-artifact` in a user namespace have the same values as the config options in MinIO?
- What are the pebble logs from MinIO's workload container?
- What are the logs from the `ml-pipeline-ui-artifact` pod in the user namespace?
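The Secret comparison in the first question can be scripted. This is a hedged sketch assuming the Secret stores the credentials under `accesskey`/`secretkey` data keys, with illustrative `foo`/`bar` values standing in for the real ones:

```python
# Sketch: compare the base64-encoded values in the mlpipeline-minio-artifact
# Secret with MinIO's config options. The data below is illustrative, standing
# in for `kubectl get secret ... -o json` and `juju config minio` output.
import base64

secret_data = {  # assumed key names: accesskey / secretkey
    "accesskey": base64.b64encode(b"foo").decode(),
    "secretkey": base64.b64encode(b"bar").decode(),
}
minio_config = {"access-key": "foo", "secret-key": "bar"}

decoded = {k: base64.b64decode(v).decode() for k, v in secret_data.items()}
in_sync = (decoded["accesskey"] == minio_config["access-key"]
           and decoded["secretkey"] == minio_config["secret-key"])
print("credentials in sync" if in_sync else "credentials drifted")
```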
And then, could you try the first two approaches we suggest in https://github.com/canonical/kfp-operators/issues/449?