plots: make output structure consistent between plot types
Follow up to #322.
The difference in behavior between log_image and log_sklearn_plot bothers me:
├── plots
│   ├── images
│   │   ├── 0
│   │   │   └── img.png
│   │   └── 1
│   │       └── img.png
│   └── sklearn
│       └── confusion_matrix.json
Now that both are saved under plots, it seems even odder that they differ so much in behavior and output structure: log_image creates a new folder for each step, while log_sklearn_plot throws an exception when used with multiple steps. #271 will only make this more complicated.
For 1.0, dvclive should be more consistently opinionated. Since the focus has shifted somewhat from serverless live tracking toward being a convenience for dvc logging, I would vote to overwrite plots at each step, since this is what tools like dvc plots diff, VS Code, and Studio expect (pushing for the nested structure originally was my mistake). That would mean the output would look like:
├── plots
│   ├── images
│   │   └── img.png  # flatten and overwrite at each step.
│   └── sklearn
│       └── confusion_matrix.json  # enable overwriting at each step.
I think it makes sense as default behavior, but it doesn't really address the use case of wanting to visualize image updates across epochs (https://github.com/iterative/vscode-dvc/issues/1640).
A nested structure makes sense for that use case. In any case, users who want that structure can generate it manually (and we can document it) as follows:
live.log_image(f"{live.get_step()}/img.png", img)
What about live.log_image(f"img/{live.get_step()}.png", img)? That gives me all steps in a single dir, which is easier for me to browse and easier for us to handle in VS Code, Studio, etc.
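To make the two naming options above concrete, here is a minimal sketch that only builds the name strings for each layout. The helper names per_step_dir_name and single_dir_name are hypothetical and not part of dvclive. The sketch assumes the result would be passed as the first argument to live.log_image, as in the snippets above, and it does not call dvclive itself.

```python
from pathlib import PurePosixPath


def per_step_dir_name(step: int, filename: str = "img.png") -> str:
    """Nested layout: one folder per step, e.g. '0/img.png'.

    Hypothetical helper, not a dvclive function.
    """
    return str(PurePosixPath(str(step)) / filename)


def single_dir_name(step: int, stem: str = "img", suffix: str = ".png") -> str:
    """Single-folder layout: one file per step, e.g. 'img/0.png'.

    Hypothetical helper, not a dvclive function.
    """
    return str(PurePosixPath(stem) / f"{step}{suffix}")


if __name__ == "__main__":
    # Show the names each scheme would produce for the first two steps.
    for step in range(2):
        print(per_step_dir_name(step), single_dir_name(step))
```

With the second scheme, every step of a given image ends up in a single img directory, which matches the browsing argument made above.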