
[bitnami/mlflow] MLFlow: Support additional artifact storage providers beyond S3

Open dhrp opened this issue 1 year ago • 9 comments

Name and Version

bitnami/mlflow 0.6.0

What is the problem this feature will solve?

The MLFlow chart (and Docker image?) does not support using Google Cloud Storage (or Azure Blob Storage) for storing artifacts, although MLFlow itself supports both.

Supporting this will make it easier to store MLFlow artifacts in the cloud provider's own storage.

We are coming from the https://community-charts.github.io/helm-charts mlflow helm chart, which does have support for it (but is otherwise outdated).

What is the feature you are proposing to solve the problem?

I propose extending the Helm chart configuration so that the user can choose a cloud storage type and supply the corresponding configuration.

I think the Docker image can simply be adjusted to also install the google-cloud-storage, azure-storage-blob and azure-identity packages. I cannot see the build details, so I cannot tell whether these are already installed, but I guess they are not.

See: https://mlflow.org/docs/latest/tracking/artifacts-stores.html#google-cloud-storage
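One way to settle the open question of whether these packages are already in the image is to run a small check inside the container. The following is only a sketch: the mapping from pip package name to import name is an assumption based on how these libraries are normally imported, and nothing here is specific to the Bitnami image.

```python
import importlib.util

# Assumed mapping from pip package name to importable module name for the
# client libraries named in this issue.
CANDIDATE_MODULES = {
    "google-cloud-storage": "google.cloud.storage",
    "azure-storage-blob": "azure.storage.blob",
    "azure-identity": "azure.identity",
}


def is_importable(module_name):
    """Return True if the module can be located on the current Python path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError (an ImportError) when a parent
        # package such as "google" or "azure" is itself missing.
        return False


def report(modules=None):
    """Map each pip package name to whether its module is importable."""
    modules = CANDIDATE_MODULES if modules is None else modules
    return {pkg: is_importable(mod) for pkg, mod in modules.items()}


if __name__ == "__main__":
    for package, present in report().items():
        print(f"{package}: {'installed' if present else 'missing'}")
```

Running this with the image's Python interpreter lists which of the three packages are present, without needing access to the image build details.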

What alternatives have you considered?

The alternative I have implemented for ourselves for now is to use the Google Cloud Storage XML API with HMAC keys, which makes it compatible with MLFlow. It works, but it would be easier for many people if support were added to the MLFlow chart.
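For readers who want to try the same workaround, here is a rough sketch of the environment it relies on. It assumes the artifact root is an s3:// URI pointing at a GCS bucket and that MLFlow's S3 client is redirected through the MLFLOW_S3_ENDPOINT_URL variable; the function name and the placeholder credentials are hypothetical, and this is not necessarily how the workaround above was configured.

```python
import os

# Public endpoint of the Cloud Storage XML API, which accepts S3-style
# requests signed with HMAC keys.
GCS_XML_ENDPOINT = "https://storage.googleapis.com"


def s3_compat_env(hmac_access_id, hmac_secret):
    """Build the environment variables read by MLflow's S3 artifact client."""
    if not hmac_access_id or not hmac_secret:
        raise ValueError("both parts of the HMAC key are required")
    return {
        "MLFLOW_S3_ENDPOINT_URL": GCS_XML_ENDPOINT,
        "AWS_ACCESS_KEY_ID": hmac_access_id,
        "AWS_SECRET_ACCESS_KEY": hmac_secret,
    }


if __name__ == "__main__":
    # Placeholder values; real ones come from an HMAC key created for a
    # GCS service account.
    env = s3_compat_env("GOOG1EXAMPLE", "example-secret")
    os.environ.update(env)
    print(sorted(env))
```

In a Helm deployment the same three variables would be supplied to the tracking server as environment variables or secrets rather than being set from Python.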

https://github.com/bitnami/charts/pull/22699

dhrp avatar Jan 25 '24 12:01 dhrp

Hi @dhrp

I think what you propose makes sense. We are open to PR contributions; would you be interested in implementing the changes? On our side, we can work on including the necessary packages in the Docker image 😁

joancafom avatar Jan 29 '24 12:01 joancafom

Hi @joancafom, I'm interested. Will need to find some time; but yes.

dhrp avatar Feb 05 '24 08:02 dhrp

I have created an internal issue to evaluate it when we have some time as well :)

joancafom avatar Feb 06 '24 17:02 joancafom

Hey @joancafom, @dhrp, any updates on this feature? It would be useful for us too. Happy to help with a PR as well.

majamil16 avatar Feb 20 '24 15:02 majamil16

Hi @majamil16

No, unfortunately we haven't had time to look into this issue yet. This feature requires both changing the image to include the components (which we might need to evaluate for feasibility) and changing the chart itself to add support for new providers.

joancafom avatar Feb 21 '24 16:02 joancafom

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] avatar Mar 08 '24 01:03 github-actions[bot]

Would be very interested in this feature too. Surprised that this is not already included... I guess not previously tried out for GCP. Blocked by this at the moment.

RussellSB avatar Mar 13 '24 11:03 RussellSB

Thank you for bringing this issue to our attention. We appreciate your involvement! If you're interested in contributing a solution, we welcome you to create a pull request. The Bitnami team is excited to review your submission and offer feedback. You can find the contributing guidelines here.

Your contribution will greatly benefit the community. Feel free to reach out if you have any questions or need assistance.

carrodher avatar Mar 13 '24 12:03 carrodher

Hi everyone! Could you please give it a try using the image tag 2.14.1-debian-12-r1?

helm install mlflow oci://registry-1.docker.io/bitnamicharts/mlflow --set image.tag=2.14.1-debian-12-r1

We included the missing Python module in this image revision.

juan131 avatar Jun 28 '24 11:06 juan131

Thanks @juan131! I was reading this issue because I ran into the same problem when trying to use Azure Blob Storage instead of S3. I was very happy to see your comment with the new tag. I tested the tag on our side, but it seems it does not yet add support for Azure.

I assume you have only added support for GCS, and I got excited too soon.

As @dhrp already mentioned, we'd need azure-storage-blob to be able to use Azure Blob Storage.
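To make the dependency explicit, the sketch below maps an artifact URI scheme to the Python packages that backend needs. The table is an assumption drawn from the packages named in this thread and from the usual MLFlow setup for each store (azure-identity is only needed for some Azure credential methods); the function name is hypothetical.

```python
from urllib.parse import urlparse

# Assumed mapping from artifact URI scheme to the pip packages the matching
# MLflow artifact store relies on.
REQUIRED_PACKAGES = {
    "s3": ["boto3"],
    "gs": ["google-cloud-storage"],
    "wasbs": ["azure-storage-blob", "azure-identity"],
}


def packages_for(artifact_uri):
    """Return the packages assumed necessary for the given artifact URI."""
    scheme = urlparse(artifact_uri).scheme.lower()
    # Unknown schemes (including local paths) yield an empty list.
    return list(REQUIRED_PACKAGES.get(scheme, []))


if __name__ == "__main__":
    for uri in (
        "s3://my-bucket/mlflow",
        "gs://my-bucket/mlflow",
        "wasbs://container@account.blob.core.windows.net/mlflow",
    ):
        print(uri, "->", ", ".join(packages_for(uri)))
```

With the image tag tested here, only the gs:// row would be expected to work out of the box in addition to s3://.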

hatemhamad avatar Jul 05 '24 10:07 hatemhamad

Hi @hatemhamad

You're right, we added support for GCS, but we don't support Azure Blob Storage by default yet. Could you please create a specific issue requesting this particular module? We'll evaluate the impact on image size and vulnerability surface and get back to you on it.

juan131 avatar Jul 15 '24 07:07 juan131

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] avatar Jul 31 '24 01:07 github-actions[bot]

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

github-actions[bot] avatar Aug 05 '24 01:08 github-actions[bot]