Support for Multi-Tenancy
Hello All,
I have an MLflow instance deployed on an OpenShift cluster, where the backend store is a PostgreSQL instance and the artifact store is a bucket in our Rook Ceph cluster.
I have two questions to ask:
- First, is it possible to add multi-tenancy to a single MLflow instance, so that experiments, the backend store and the artifact store are filtered on the basis of user authentication?
- And, if yes, can you please point me to any documentation, instructions or manuals where I can find out more about how to achieve this?
This is my issue as well. I can see how I could set up some sort of vouch-proxy that updates the headers per user, use independent OAuth2 tokens, or share a single token among all users, but I'm particularly worried about how to host vanilla MLflow so that multiple users can access it securely from multiple sites using Python. I'm not even that concerned about having separate user state, so much as support for multiple OAuth2 tokens, better nginx/HTTP header integration, or something along those lines.
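To make the header idea a bit more concrete, here is a minimal sketch of a WSGI middleware that could wrap the tracking server's app and only let through requests carrying a user header set by the proxy. It is not something MLflow ships: the class name TenantHeaderMiddleware, the "tenant.user" environ key, and the default header name X-Vouch-User are all assumptions for illustration, and the approach is only safe if the proxy strips any client-supplied copy of that header.

```python
class TenantHeaderMiddleware:
    """Hypothetical WSGI middleware: trust a user header set by an auth proxy."""

    def __init__(self, app, header="X-Vouch-User", allowed=None):
        self.app = app
        # WSGI exposes request headers as HTTP_<NAME> with dashes as underscores.
        self.environ_key = "HTTP_" + header.upper().replace("-", "_")
        # Optional allow-list of user names; None means any non-empty user.
        self.allowed = set(allowed) if allowed is not None else None

    def __call__(self, environ, start_response):
        user = environ.get(self.environ_key, "").strip()
        if not user or (self.allowed is not None and user not in self.allowed):
            body = b"authenticated user header missing or not allowed\n"
            start_response(
                "403 Forbidden",
                [("Content-Type", "text/plain"),
                 ("Content-Length", str(len(body)))],
            )
            return [body]
        # Expose the user to the wrapped app under an assumed key.
        environ["tenant.user"] = user
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    # Stand-in for the real tracking server app; it just echoes the user.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["tenant.user"].encode()]


wrapped = TenantHeaderMiddleware(demo_app, allowed={"alice"})
```

This only gates access; it does not separate experiments per user, which still has to happen somewhere behind it.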
The approach I am considering is to put a wrapper or decorator around most MLflow API calls to initialize and switch the DB per team based on the inputs. Others have already done basic auth with nginx in front of MLflow. If you use S3 as the artifact store and don't proxy artifacts through the tracking server, clients upload directly to the bucket, and the bucket name (i.e. the username) and token serve as the credentials. The same auth is used for nginx, and the bucket name can double as the team name and the DB schema name. The wrapper or decorator would then either reinitialize the DB connection and wait asynchronously until it points to the right place before querying, or reinitialize the whole Flask app/server with the new info if that is easier.
That doesn't cover the UI yet, but I assume a similar approach could be taken. The UI is also less demanding, because it can run anywhere on demand and doesn't need to be up or accessible 24/7. Others have put it inside JupyterHub. It could even just be run locally.
I'm not sure it's all worth the work though. Separate team instances and auth are already easy to achieve. We just have to suck it up and handle the inefficiency of running hundreds of UIs and tracking servers when one with multi-tenancy would do.
Our company badly needs this feature.
Built-in multi-tenancy support would help MLflow get adopted more widely at the enterprise level. It would also save a lot of DevOps and integration time that currently goes into plumbing and developing all the intricacies of authentication and authorization. I definitely need this feature!
I suspect this is part of the point. The core app has been open sourced, including v2.x recently, but the enterprise parts are part of Databricks' product.
Looking for this feature to be implemented as open source as well.
Has there been any update on this?