Feature request: Coarser level of organization in Tracking.

Open kaleko opened this issue 5 years ago • 11 comments

Right now it seems the coarsest level of organization within mlflow tracking is the "experiment".

My group is working on three completely separate types of tasks (for example, predicting housing prices, classifying images of cats, and clustering customers), and is therefore using three separate mlflow tracking servers. As their manager, I'd rather have one centralized server where the top level of organization is the project they are working on, with the experiments inside each project and the runs inside each experiment. Is this possible, or is this expected to be possible in future mlflow releases? I really think this would be a useful feature.

This comment got several upvotes in the mlflow slack channel, so I am creating an official feature request here.

kaleko avatar Jun 17 '19 14:06 kaleko

@kaleko sorry for the delayed response - out of curiosity is there a reason each project can't be encapsulated by a single MLflow experiment? Maybe the gap here is a UI/UX issue (e.g. maybe if it were easier to label runs & view runs by label your use case would be satisfied?)

smurching avatar Jul 26 '19 21:07 smurching

nice thread, some thoughts

@smurching I don't think it makes much sense to force experiments to act as projects; it feels like a hack, and we lose the most natural aggregation (the experiment). We need another level, I think (it could be a hierarchical workspace/folder string, something like that; btw, what does Databricks use?).

We're tending toward a single mlflow backend per project/initiative because, by default, we can't easily filter the UI by project.

@kaleko although a global view seems interesting, maybe we could dive a little deeper into the actual benefits/scenarios? Comparing runs between projects doesn't make much sense. I see some value in comparing stats (e.g. number of runs, by whom, last run, etc.) between projects, but that can probably be handled better by a BI tool after extracting run/experiment info from all stores? Or is it only about having a single starting home page?

Having a single backend/tracking server though would make things easier for maintenance and hosting.

rquintino avatar Jul 27 '19 23:07 rquintino

related in "Feature request: Multi-user support for tracking and UI #724" https://github.com/mlflow/mlflow/issues/724

"When accessing the UI, they pass the 'namespace' as a parameter and then the UI only show their own experiments."

rquintino avatar Jul 27 '19 23:07 rquintino

@smurching I think the natural structure is that people work on different projects, each project has a number of different experiments in it, and each experiment may consist of a number of different runs. For example, a project might be "classifying images that have cats in them", an experiment might be "trying to preprocess the image before inference", and runs might be "compressing the image", "blurring the image", "increasing contrast in the image", etc.

@rquintino For me, the main benefit would be in only having to host a single mlflow server. Each time I create a new one, I have to provision a machine, install packages, download docker, pull docker image, etc. I would prefer to have one server that everyone uses, everyone shares the same access credentials, and if someone changes and wants to spawn off a new project, they can do so without having to go to an admin and ask for a new mlflow server.

kaleko avatar Jul 29 '19 20:07 kaleko

Thanks @kaleko, we share the same understanding of experiments & runs (btw, related: want to comment on this one? "Compare runs from different experiments" https://github.com/mlflow/mlflow/issues/1414 )

I understand the hosting issue, makes sense. Note, in case it helps: what we are doing right now is starting an mlflow ui process for each Docker Jupyter dev environment, with mlflow ui running through nbserverproxy (the mlruns backend store is shared per project). It's more of a tool team members can use while they are working; there is no centralized, permanent mlflow server running for now.

rquintino avatar Jul 29 '19 21:07 rquintino

In the next version of MLflow (1.28.0), we're going to release the search_experiments API that should help search/discover experiments using names/tags.
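
For reference, a minimal sketch of how `search_experiments` could stand in for project-level grouping, assuming each experiment carries a `project` tag. The tag key, the `project/name` naming convention, and the helper names below are assumptions made for illustration, not part of MLflow.

```python
def project_filter(project: str) -> str:
    """Build a search_experiments filter for a hypothetical `project` tag."""
    # Quotes are rejected rather than escaped to keep the sketch simple.
    if "'" in project:
        raise ValueError("project name must not contain a single quote")
    return f"tags.project = '{project}'"


def create_project_experiment(project: str, name: str) -> str:
    """Create an experiment tagged with its project; returns the experiment ID."""
    import mlflow  # imported lazily so project_filter works without MLflow installed

    # The "project/name" prefix is only a naming convention (an assumption here).
    return mlflow.create_experiment(f"{project}/{name}", tags={"project": project})


def find_project_experiments(project: str):
    """Return the experiments whose `project` tag matches the given project."""
    import mlflow  # search_experiments requires MLflow >= 1.28.0

    return mlflow.search_experiments(filter_string=project_filter(project))
```

Calling `find_project_experiments("cat-images")` should then return only the experiments tagged for that project.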

harupy avatar Jul 21 '22 16:07 harupy

Hi, just curious to know if you are still thinking of creating a higher-level grouping of content, with the ability to group experiments under something like Projects?

NSiddanth avatar Oct 06 '23 11:10 NSiddanth

Yes, that would definitely be very valuable for us. We are still creating separate service instances for each and every project to avoid the clutter of seeing everything in one long list of experiments. For a shared instance at enterprise level, another level on top of projects would also be desirable: teams/projects/experiments. Tags could be sufficient if filtering by tags were well supported across all UI functionality. In particular, it should be possible to persist filter settings across views and sessions at user level, so that filtering down does not have to be repeated manually all the time.
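
As a rough illustration of the tags-only approach, here is a sketch that folds a flat list of experiments into a teams/projects/experiments tree. The `team` and `project` tag keys and the `Exp` stand-in class are assumptions for the example; MLflow's own `Experiment` entities expose `name` and `tags` in a similar way.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Exp:
    """Stand-in for an experiment record with a name and a tag dictionary."""
    name: str
    tags: dict = field(default_factory=dict)


def group_by_team_and_project(experiments):
    """Return {team: {project: [experiment names]}} built from tags alone."""
    tree = defaultdict(lambda: defaultdict(list))
    for exp in experiments:
        # Untagged experiments land in explicit placeholder buckets.
        team = exp.tags.get("team", "(no team)")
        project = exp.tags.get("project", "(no project)")
        tree[team][project].append(exp.name)
    # Convert to plain dicts with sorted names for stable display.
    return {
        team: {project: sorted(names) for project, names in projects.items()}
        for team, projects in tree.items()
    }


experiments = [
    Exp("contrast", {"team": "vision", "project": "cats"}),
    Exp("blur", {"team": "vision", "project": "cats"}),
    Exp("prices-v1", {"team": "pricing", "project": "housing"}),
    Exp("scratch"),
]
tree = group_by_team_and_project(experiments)
```

This only shows that the hierarchy can be derived from tags; persisting the filter per user, as asked for above, would still need UI support.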

BigNerd avatar Oct 06 '23 11:10 BigNerd

Hi @harupy and @smurching, thanks for all the information and for adding the search functionality that will make experiments easier to find. I do think it would be very helpful at an "enterprise" level to have a separation per Organization/Directory.

The assumption mentioned in this thread, that different people work on different projects and therefore each project maps to an experiment, is correct IMO, but it misses the view from a larger organization where those people are not necessarily in the same team or department. Sometimes company policies even forbid sharing projects, and teams should not have access to other orgs.

For example, suppose there is a Supply Chain Data Science team, a Commercial/Ecomm Data Science team doing personalisation, and a Robotics Data Science team. Each of them belongs to a different department, but the MLOps Engineering/Platform team is in IT (this is a very common pattern). The IT platform team would of course benefit from maintaining a single MLflow instance, but the different departments need to be segregated in the UI/UX. Even if we forget about permissions and user management and look only at the UI, you can imagine it becoming a mess quite fast. In contrast, it would be great if a user recognized as part of a department could see the experiments at that level.

Take the example of Jenkins: if you had a single UI showing all pipeline runs it would be a mess; instead there is the concept of Directories, where different CI pipelines can be segregated. Of course Jenkins does have user management based on permissions, but I think having this Department/Folder/Directory level would already be a big win.

Hope this helps clarify the usability of this feature across a company with different teams and departments. Thanks! Let me know what you think, and whether it is something that can be considered or if it is completely off the table (either way, it would help us choose the correct tool for our efforts).

rragundez avatar Feb 16 '24 03:02 rragundez

Following this request, as in my experience this is a critical one. Without this feature, adopting MLflow can be clunky or impossible in almost every multi-team or multi-project context, enterprise or not.

MattiaGallegati avatar Feb 27 '24 21:02 MattiaGallegati

Hi there, also very interested in this feature. It is the main reason so far why I can't get our team to use MLflow - because of the limitations in organisation of experiments and runs.

MeganBeckett avatar Apr 30 '24 07:04 MeganBeckett

Just adding another request to this; for visibility purposes we're defining each experiment as the lifecycle and successive training runs of one particular ML model, but that makes it very painful in the UI because we can only differentiate across projects by a mess of tags that must be set each time. The registry is even more frustrating to use in this regard.

DavidSlayback avatar Jun 12 '24 14:06 DavidSlayback