`html`: How to track/ignore it in DVC

Open dberenbaum opened this issue 2 years ago • 6 comments

After https://github.com/iterative/dvc.org/pull/3411, the easiest way to track plots in dvc.yaml is:

    plots:
      - dvclive

However, this will track dvclive/report.html, which is generally not desired. Some options to handle this better:

  • Automatically add dvclive/report.html to .dvcignore (should we check for the presence of DVC or just add it anyway?).
  • Move report.html outside of the dvclive dir.
  • Allow DVC to track it -- is it harmful?
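For illustration, the first option could be sketched roughly as follows. This is only a sketch under assumptions: the helper name `ignore_report` and the `require_dvc` flag are hypothetical and not part of DVCLive, and DVC presence is detected simply by looking for a `.dvc` directory at the given root.

```python
from pathlib import Path

REPORT_PATTERN = "dvclive/report.html"  # path taken from the issue text


def ignore_report(repo_root, pattern=REPORT_PATTERN, require_dvc=True):
    """Append `pattern` to <repo_root>/.dvcignore unless it is already listed.

    Hypothetical helper, not part of DVCLive. Returns True if the file was
    changed and False otherwise.
    """
    root = Path(repo_root)
    # Assumption: a DVC repository is recognized by its `.dvc` directory.
    if require_dvc and not (root / ".dvc").is_dir():
        return False

    ignore_file = root / ".dvcignore"
    text = ignore_file.read_text() if ignore_file.exists() else ""
    if pattern in (line.strip() for line in text.splitlines()):
        return False  # already listed, so repeated calls change nothing

    # Make sure the new entry starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    ignore_file.write_text(text + pattern + "\n")
    return True
```

Passing `require_dvc=False` corresponds to the "just add it anyway" variant: the entry is written even when no `.dvc` directory is found.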

dberenbaum, Apr 18 '22 19:04

One more option would be to always suggest using cache: false for plots and capturing all dvclive outputs in git instead of dvc.

dberenbaum, Apr 18 '22 19:04

Automatically add dvclive/report.html to .dvcignore (should we check for the presence of DVC or just add it anyway?).

I wouldn't really like to add DVC-specific logic.

Move report.html outside of the dvclive dir.

Not sure if it's a good idea. General feedback is that DVC/DVCLive already generates too many files at the root of the repo.

Allow DVC to track it -- is it harmful?

I might be missing something but it would be just wasted storage.


What if we encourage, through the docs, setting outputs for each data subfolder?

This way the report won't be tracked by DVC, and the setup allows for more flexibility. The most common setup, IMO, would be:

    plots:
      - dvclive/images
      - dvclive/scalars:
          cache: false

daavoo, Apr 19 '22 08:04

What if we encourage, through the docs, setting outputs for each data subfolder?

This way the report won't be tracked by DVC, and the setup allows for more flexibility. The most common setup, IMO, would be:

    plots:
      - dvclive/images
      - dvclive/scalars:
          cache: false

I think this makes sense for now. It feels a little burdensome to have to add all this to get dvclive outputs tracked properly, but at least it's explicit and flexible.

dberenbaum, Apr 21 '22 20:04

Would https://github.com/iterative/dvclive/pull/322 close this? Given that plots will be isolated in their own folder, the following won't track the report:

    plots:
      - dvclive/plots

daavoo, Oct 05 '22 14:10

🤔 It's definitely better. If we continue towards trying to auto-configure all DVCLive outputs in dvc.yaml, we would still want a way to specify the HTML as ignored, though, right?

dberenbaum, Oct 05 '22 15:10

If we are writing dvc.yaml from dvclive, is there any downside to adding the report file to .dvcignore?

dberenbaum, Oct 05 '22 15:10