
[bitnami/mlflow] support google-cloud-storage

Open dhrp opened this issue 1 year ago • 9 comments

Description of the change

Updates the Bitnami MLFlow image to contain the google-cloud-storage pip module.

Benefits

This allows MLFlow to use its built-in Google Cloud Storage support for storing and retrieving artifacts. This matters most when running the MLFlow tracking server with Google Cloud Storage as the artifact store, but it is also useful when using the MLFlow container in client mode.
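
To illustrate the dependency described above: MLFlow handles `gs://` artifact locations through the `google-cloud-storage` package, so a quick check inside the container can tell whether such a location is usable. The following is a minimal sketch, not part of this PR; the helper names `module_available` and `gcs_artifacts_usable` and the bucket name are hypothetical, and the check only covers importability, not credentials or bucket permissions.

```python
import importlib.util
from urllib.parse import urlparse


def module_available(name: str) -> bool:
    """Return True if `name` can be imported in the current environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (for example no `google` package at all)
        # raises ModuleNotFoundError instead of returning None.
        return False


def gcs_artifacts_usable(artifact_uri: str) -> bool:
    """Hypothetical helper: False only for gs:// URIs when the client library is missing."""
    if urlparse(artifact_uri).scheme != "gs":
        # Other schemes (local paths, s3://, ...) do not need google-cloud-storage.
        return True
    return module_available("google.cloud.storage")


if __name__ == "__main__":
    # The bucket name is an assumption made up for this example.
    uri = "gs://example-bucket/mlflow-artifacts"
    print(f"{uri} usable: {gcs_artifacts_usable(uri)}")
```

Running this with the image's Python interpreter would be expected to report the `gs://` location as usable only when the module is installed.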

Possible drawbacks

None that I know of, though I don't know whether adding a pip install at the end is the approach Bitnami prefers. An alternative approach would be to add it to the mlflow stacksmith tarball, but AFAIK I cannot contribute to that. I will leave that to the maintainers.

Applicable issues

  • Add support to the container: fixes https://github.com/bitnami/containers/issues/65108
  • Add support for Google Cloud Storage in the MLFlow chart: relates to https://github.com/bitnami/charts/issues/22720

Additional information

I'm happy to change the approach if someone points me to how.

dhrp avatar May 25 '24 07:05 dhrp

I'm chasing one weird issue: in some server configurations the mlflow client fails to download artifacts directly from Google storage with a permission error, even though the server can access the artifacts just fine and uploads also work.

dhrp avatar May 28 '24 13:05 dhrp

Hi @dhrp

Thanks so much for this contribution!

We're currently evaluating the impact of including this Python module, which seems to increase the image size by 16MB. We need to decide whether it's widely used before including it, since we want the image to include only the most important modules and ask users to extend the image by adding their own modules for less common use cases.

In case we decide to accept it, please note it won't be included in the image using the "pip install" directive you proposed, but as part of the mlflow-2.13.0-0-linux-${OS_ARCH}-debian-12.tar.gz tarball added below:

  • https://github.com/bitnami/containers/blob/main/bitnami/mlflow/2/debian-12/Dockerfile#L32

We'll keep you updated about any decision we take.

juan131 avatar May 30 '24 06:05 juan131

Hi @juan131, ok; thanks for your message.

It seems that at least two other people commented on my MLFlow Helm chart issue, which also has 4 👍, saying they would like to have the feature. See: https://github.com/bitnami/charts/issues/22720

Also: is there any way to see or contribute to how these tarballs are created?

dhrp avatar May 30 '24 11:05 dhrp

Hi @dhrp

Thanks for the insights, I'll share with the team.

> Also: is there any way to see or contribute to how these tarballs are created?

I'm afraid the compilation recipes we use to build Bitnami assets are internal. We may consider moving them to a public repo since there's nothing to hide in them.

juan131 avatar May 31 '24 06:05 juan131

This Pull Request has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thank you for your contribution.

github-actions[bot] avatar Jun 16 '24 01:06 github-actions[bot]

not stale

juan131 avatar Jun 21 '24 13:06 juan131

Can we move this forward? I personally think the number of 👍 (now 6) on the MLFlow chart issue that depends on this change is sufficient to justify it, if it's only a 16MB size increase. As I see it, storing models on durable storage really is a primary feature of MLFlow.

dhrp avatar Jun 25 '24 09:06 dhrp

Hi @dhrp

I'm glad to confirm we got the "green light" to include this module by default in the image. I'm applying the required changes right now and I'll ping you once we have released a new container image version that includes it.

juan131 avatar Jun 28 '24 09:06 juan131

Hi @dhrp

Please give it a try using the image tag 2.14.1-debian-12-r1

juan131 avatar Jun 28 '24 11:06 juan131

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Pull Request. Do not hesitate to reopen it later if necessary.

github-actions[bot] avatar Jul 04 '24 01:07 github-actions[bot]

I'm closing this PR given that we included the missing pip module in the 2.14.1-debian-12-r1 revision. Please reopen it if you require further assistance.

juan131 avatar Jul 15 '24 07:07 juan131

Hi @juan131, somehow I didn't see this until now! Thanks so much for moving this forward. Very happy about it!

dhrp avatar Jul 21 '24 13:07 dhrp