dvclive
integrations: other ML tracking tools
Now that dvclive is decoupled from dvc, it should be fairly straightforward to log data in the formats expected by other ML tracking tools such as mlflow. It would be great to establish a pattern for adding such integrations, making dvclive a lightweight, tool-agnostic wrapper around them.
This approach has a few benefits:
- Users who don't have dvc or don't want to use dvc integration still have options for visualizing their model progress.
- Users can switch between different ML tracking tools by simply switching the integration they use. For example, specifying the integration might be as simple as `dvclive.init(style="mlflow")`.
- Fewer development resources spent on visualization in dvclive itself.
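To make the "switch the integration" idea concrete, here is a minimal sketch of one possible pattern: a registry that maps a `style` string to a backend. Everything in it (`Backend`, `InMemoryBackend`, `register_backend`, `Logger`) is hypothetical and not part of dvclive's actual API; the in-memory backend stands in for a real tool such as mlflow.

```python
from typing import Callable, Dict, List, Protocol, Tuple


class Backend(Protocol):
    """Hypothetical interface every integration would implement."""

    def log(self, name: str, value: float, step: int) -> None: ...


class InMemoryBackend:
    """Stand-in backend that only records calls (no real tracking tool involved)."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, float, int]] = []

    def log(self, name: str, value: float, step: int) -> None:
        self.records.append((name, value, step))


# Registry mapping a style name to a factory that builds the backend.
_BACKENDS: Dict[str, Callable[[], Backend]] = {"memory": InMemoryBackend}


def register_backend(style: str, factory: Callable[[], Backend]) -> None:
    """Register a new integration under the given style name."""
    _BACKENDS[style] = factory


class Logger:
    """Thin wrapper that forwards every logged value to the chosen backend."""

    def __init__(self, style: str) -> None:
        if style not in _BACKENDS:
            raise ValueError(f"Unknown style: {style!r}")
        self.backend = _BACKENDS[style]()
        self.step = 0

    def log(self, name: str, value: float) -> None:
        self.backend.log(name, value, self.step)

    def next_step(self) -> None:
        self.step += 1
```

With this shape, user code would only change the `style` argument to switch tools, and each new integration would be one class plus one `register_backend` call.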
Tools to support for integration:
- [ ] mlflow: https://www.mlflow.org/docs/latest/tracking.html#performance-tracking-with-metrics
- [ ] tensorboard: https://www.tensorflow.org/tensorboard/scalars_and_keras
- [ ] aimstack: https://aimstack.io/
This might require first making the current dvc integration more of a general abstraction.
For reference, comet.ml has a built-in integration with mlflow: https://www.comet.ml/docs/python-sdk/mlflow/
It just requires adding `import comet_ml` to the script where `mlflow` is being used. It seems to capture existing `mlflow.log*` calls and "redirect" them to `comet_ml`.
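I have not checked how comet implements this, but as a rough illustration of the general technique, a library can wrap another module's logging functions so each call is mirrored elsewhere before the original runs. In the sketch below `tracker` is a stand-in object, not mlflow, and `mirror` is a hypothetical helper.

```python
import functools
import types

# Stand-in for a third-party tracking module (NOT mlflow itself).
tracker = types.SimpleNamespace(calls=[])


def _log_metric(key, value):
    tracker.calls.append((key, value))


tracker.log_metric = _log_metric

# Where the mirrored calls end up in this sketch.
mirrored = []


def mirror(module, func_name, sink):
    """Replace module.func_name with a wrapper that also reports each call to sink."""
    original = getattr(module, func_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        sink(func_name, args, kwargs)      # forward a copy of the call
        return original(*args, **kwargs)   # keep the original behaviour

    setattr(module, func_name, wrapper)


mirror(tracker, "log_metric", lambda name, args, kwargs: mirrored.append((name, args)))

# Existing user code keeps calling the tracker as before.
tracker.log_metric("acc", 0.9)
```

The appeal of this approach is that user code stays untouched; the downside is that it depends on patching another library's functions at import time.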
In addition, they have an open source tool to "migrate" previous MLFlow experiments:
> This extension will synchronize previous MLFlow experiment runs with all runs tracked with Comet's Python SDK with MLFlow support, for deeper experiment instrumentation and improved logging, visibility, project organization and access management.
Also related are the slides from yesterday's meetup: https://docs.google.com/presentation/d/1TfChy39Xb6vKVvuMaWihZjbYO97CRnESqd6-xdmbQV4/edit#slide=id.p. The whole presentation should be up soon in https://discord.gg/STQyxbU6.
There was an example of using dvc experiments and then having a stage at the end to log all the results to wandb to visualize. We could build off that idea or document use cases like this as an alternative to building integrations.
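As a rough sketch of what such a final stage could look like (not the code from the meetup), the script below reads a JSON metrics file and hands the values to a pluggable `send` callable. The function name `export_metrics` and the metrics file layout are assumptions for illustration.

```python
import json
from pathlib import Path
from typing import Callable, Dict


def export_metrics(
    metrics_path: Path, send: Callable[[Dict[str, float]], None]
) -> Dict[str, float]:
    """Read a flat JSON metrics file and forward its contents to a tracking tool.

    `send` is whatever function the target tool uses to log a dict of metrics.
    With wandb, for example, one could call wandb.init(...) first and then
    pass wandb.log here (assumption: the file holds a flat name -> value mapping).
    """
    metrics = json.loads(metrics_path.read_text())
    send(metrics)
    return metrics
```

Because the export runs as its own stage, the training code itself does not need to import or know about the tracking tool.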
There was also some discussion about how one perk of dvc is that it doesn't intrude on user code, which is something we should be mindful of with how we design dvclive. @daavoo I'd recommend taking a look when you have a chance, although it can wait a few weeks.
I will take a look at the video.
Added https://aimstack.io/ to the list
Aim documents conversions from TensorBoard and MLFlow: https://aimstack.readthedocs.io/en/latest/quick_start/convert_data.html