integrations: other ML tracking tools

Open dberenbaum opened this issue 3 years ago • 6 comments

Now that dvclive is decoupled from dvc, it should be pretty trivial to log data in the formats expected by other ML tracking tools like mlflow. It would be great to establish a pattern for adding integrations with those tools, making dvclive a lightweight, tool-agnostic wrapper around them.

This approach has a few benefits:

  • Users who don't have dvc or don't want to use dvc integration still have options for visualizing their model progress.
  • Users can switch between different ML tracking tools by simply switching the integration they use. For example, specifying the integration might be as simple as dvclive.init(style="mlflow").
  • Fewer development resources spent on visualization in dvclive itself.
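As a rough illustration of the `style=` idea above, here is a minimal sketch of a logger that dispatches to a pluggable backend. All names (`Logger`, `Backend`, `InMemoryBackend`, `BACKENDS`) are hypothetical and are not part of dvclive; the in-memory backend is only a stand-in for a real tool.

```python
from typing import Dict, List, Protocol, Tuple, Type


class Backend(Protocol):
    """Interface every integration would implement (hypothetical)."""

    def log_metric(self, name: str, value: float, step: int) -> None: ...


class InMemoryBackend:
    """Stand-in backend that only records what it receives.

    A real mlflow backend could forward to
    mlflow.log_metric(name, value, step=step) instead.
    """

    def __init__(self) -> None:
        self.records: List[Tuple[str, float, int]] = []

    def log_metric(self, name: str, value: float, step: int) -> None:
        self.records.append((name, value, step))


# Hypothetical registry mapping a style name to a backend class.
BACKENDS: Dict[str, Type] = {"memory": InMemoryBackend}


class Logger:
    """Tool-agnostic front end: user code only talks to this class."""

    def __init__(self, style: str = "memory") -> None:
        if style not in BACKENDS:
            raise ValueError(f"Unknown style: {style!r}")
        self.backend = BACKENDS[style]()
        self.step = 0

    def log(self, name: str, value: float) -> None:
        self.backend.log_metric(name, float(value), self.step)

    def next_step(self) -> None:
        self.step += 1
```

With this shape, switching tools would only mean changing the `style` argument; the training loop's `log` and `next_step` calls stay the same.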

Tools to support:

  • [ ] mlflow: https://www.mlflow.org/docs/latest/tracking.html#performance-tracking-with-metrics
  • [ ] tensorboard: https://www.tensorflow.org/tensorboard/scalars_and_keras
  • [ ] aimstack: https://aimstack.io/

dberenbaum avatar Mar 25 '21 18:03 dberenbaum

This might require first making the current dvc integration more of a general abstraction.

dberenbaum avatar Mar 25 '21 18:03 dberenbaum

For reference, comet.ml has a built-in integration with mlflow:

https://www.comet.ml/docs/python-sdk/mlflow/

It just requires adding import comet_ml to the script where mlflow is being used. It seems to capture existing mlflow.log* calls and "redirect" them to comet_ml.
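The "capture and redirect" behaviour can be approximated by wrapping a logging function so that each call is also sent to a second sink. This is only a sketch of the general technique, not a description of how comet_ml is actually implemented; `tracker` is a stand-in namespace rather than the real mlflow module, and `mirror` is a hypothetical helper.

```python
import functools
from types import SimpleNamespace

# Stand-in for a tracking library that exposes a log_metric function.
original_calls = []


def _log_metric(key, value, step=None):
    original_calls.append((key, value, step))


tracker = SimpleNamespace(log_metric=_log_metric)

# Calls mirrored to the second tool end up here.
mirrored = []


def mirror(module, func_name, sink):
    """Replace module.func_name with a wrapper that also calls sink."""
    original = getattr(module, func_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        sink(func_name, args, kwargs)       # forward to the second tool
        return original(*args, **kwargs)    # keep the original behaviour

    setattr(module, func_name, wrapper)


mirror(tracker, "log_metric",
       lambda name, args, kwargs: mirrored.append((name, args, kwargs)))

# User code is unchanged: it still calls the original API.
tracker.log_metric("loss", 0.5, step=1)
```

The appeal of this approach is that user scripts need no changes beyond the import; the cost is that it depends on patching another library's functions at import time.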

In addition, they have an open source tool to "migrate" previous MLFlow experiments:

This extension will synchronize previous MLFlow experiment runs with all runs tracked with Comet's Python SDK with MLFlow support, for deeper experiment instrumentation and improved logging, visibility, project organization and access management.

daavoo avatar Jun 25 '21 08:06 daavoo

Also related are the slides from yesterday's meetup: https://docs.google.com/presentation/d/1TfChy39Xb6vKVvuMaWihZjbYO97CRnESqd6-xdmbQV4/edit#slide=id.p. The whole presentation should be up soon in https://discord.gg/STQyxbU6.

There was an example of using dvc experiments and then having a stage at the end to log all the results to wandb to visualize. We could build off that idea or document use cases like this as an alternative to building integrations.
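A final "publish" stage along those lines might look like the sketch below. It assumes the pipeline writes its metrics to a JSON file (the file layout and the `publish` and `flatten` helpers are assumptions for illustration). The tracking call is passed in as `log_fn`, so it could be `wandb.log` after `wandb.init(...)`, or any other callable that accepts a dict.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict


def flatten(metrics: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn nested metrics like {"train": {"loss": 0.5}} into {"train/loss": 0.5}."""
    flat: Dict[str, Any] = {}
    for key, value in metrics.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}/"))
        else:
            flat[name] = value
    return flat


def publish(metrics_path: str, log_fn: Callable[[Dict[str, Any]], None]) -> Dict[str, Any]:
    """Read a metrics file and hand the flattened values to a tracking tool."""
    metrics = json.loads(Path(metrics_path).read_text())
    flat = flatten(metrics)
    log_fn(flat)
    return flat
```

Because the training code never imports the tracking tool, this keeps the "doesn't intrude on user code" property: only the last stage knows which tool is in use.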

There was also some discussion about how one perk of dvc is that it doesn't intrude on user code, which is something we should be mindful of with how we design dvclive. @daavoo I'd recommend taking a look when you have a chance, although it can wait a few weeks.

dberenbaum avatar Jun 25 '21 13:06 dberenbaum

I will take a look at the video.

daavoo avatar Jul 12 '21 08:07 daavoo

Added https://aimstack.io/ to the list

daavoo avatar Oct 08 '21 11:10 daavoo

Conversions from TensorBoard and MLFlow https://aimstack.readthedocs.io/en/latest/quick_start/convert_data.html

daavoo avatar Apr 07 '22 17:04 daavoo