dvclive Cache dvclive directory

Background from https://github.com/iterative/example-repos-dev/issues/249:

do -o eval for the entire dir

I increasingly feel this would be the best default for dvclive, both with and without pipelines. It requires a DVC remote, but I think it's needed in almost every real scenario, and Git-only workflows are likely not helping with onboarding since I have found most people are confused by the idea of tracking dvclive with Git. Caching all of dvclive simplifies the workflow. Without pipelines, it's easy to log any size files and know that everything logged by dvclive is saved in the same place. With pipelines, it makes the stage definition simple since you just add dvclive as a stage output.

Originally posted by @dberenbaum in https://github.com/iterative/example-repos-dev/issues/249#issuecomment-1707106231
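
As a rough illustration of the pipeline case in the quote above, this Python sketch renders a minimal dvc.yaml stage whose only output is the dvclive directory. The stage name `train`, the command `python train.py`, and the helper `render_stage` are hypothetical and chosen for illustration; only the `stages` / `cmd` / `outs` layout comes from the dvc.yaml format.

```python
def render_stage(name: str, cmd: str, out_dir: str = "dvclive") -> str:
    """Return dvc.yaml text for one stage with a single cached output.

    Hypothetical helper for illustration; DVC does not ship this function.
    """
    lines = [
        "stages:",
        f"  {name}:",
        f"    cmd: {cmd}",
        "    outs:",
        f"      - {out_dir}",  # the whole dvclive directory is the only output
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Stage name and command are assumptions, not taken from the example repos.
    print(render_stage("train", "python train.py"), end="")
```

The point of the sketch is that the stage needs no per-file `metrics` or `plots` entries, only one `outs` item.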

Following #687 and dvclive 3.0 release, it will be possible to dvc add the entire dvclive directory. Steps in this direction can be:

1. Make the example repos track the entire directory with DVC.
2. Document this as the recommended pattern.
3. Have an option to do this automatically in dvclive.
4. Make it the default behavior in dvclive.

Let's start with 1 and see whether it feels like an improvement, in which case we can move on to the rest of the list. If we find it becomes the recommended pattern, we can make it a default in dvclive 4.0 (similar to the progression of save_dvc_exp).

Sep 06 '23 15:09 dberenbaum

@shcheklein Let's collect here the issues that we are seeing with this approach so it's easier to evaluate. So far I see:

Sep 11 '23 17:09 dberenbaum

TBH it's already enough to at least give me pause. Ideally, I would probably cache plots and artifacts and leave metrics.json and params.yaml to Git, but this feels complicated.

Sep 11 '23 17:09 dberenbaum
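
To make the split mentioned above concrete (cache plots and artifacts, leave metrics.json and params.yaml to Git), here is a small Python sketch that decides where a file under the dvclive directory would be stored. The helper `storage_for`, the `GIT_TRACKED` set, and the example file names are assumptions for illustration, not part of dvclive.

```python
from pathlib import PurePosixPath

# Assumption: only these two small text files stay in Git, as suggested above.
GIT_TRACKED = {"metrics.json", "params.yaml"}


def storage_for(path: str, live_dir: str = "dvclive") -> str:
    """Return "git" or "dvc" for a file path located under live_dir.

    Raises ValueError if the path is not inside live_dir.
    """
    rel = PurePosixPath(path).relative_to(live_dir)
    return "git" if rel.as_posix() in GIT_TRACKED else "dvc"


if __name__ == "__main__":
    for p in ("dvclive/metrics.json", "dvclive/plots/metrics/loss.tsv"):
        print(p, "->", storage_for(p))
```

Even in this tiny form, the rule has to be explained and configured somewhere, which is the complication under discussion.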

Lack of caching

It should be solved by @skshetry already, AFAIU?

Remote access

I think I agree with you that it's still worth it, though. It makes the management of the directory easier.

Badges

Seems minor? (I wish we had them; maybe just make a copy of the metrics file in some way.)

but this feels complicated.

What do you see here as the main complication? People annoyed by the presence of the eval or dvclive dir? Something else?

Sep 11 '23 18:09 shcheklein

What do you see here as the main complication? People annoyed by the presence of the eval or dvclive dir? Something else?

My main concerns are:

Having to explain that different outputs get saved to different places and explaining how to configure this. One of the main benefits of this approach was keeping everything in one place and simplifying the pipeline to a single dvclive output.
Remote access. Something like dvc exp show or the VS Code table will fail to show old commits if remote access isn't available, for example.

Sep 12 '23 15:09 dberenbaum

Feedback from https://iterativeai.slack.com/archives/C01SR9Q12LB/p1695337625637059:

is logging nicely in studio with dvclive and I am able to monitor the training curves, however after running dvc exp push - it’s due to missing remote - we delete live metrics but metrics from remote are not available yet. Not a smooth experience.

Sep 22 '23 14:09 dberenbaum

A point in favor of caching the dvclive directory by default: transitioning to caching it later (for example, as a pipeline output) requires running git rm --cached ... to drop it from Git first, introducing potential friction.

Sep 28 '23 17:09 dberenbaum
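
The transition described above can be sketched as a short command sequence. This Python snippet only builds the commands and does not run them; the helper `migration_commands` is hypothetical, and it assumes the directory sits at the repository root so that `dvc add` writes `dvclive.dvc` and updates `.gitignore` there.

```python
import shlex


def migration_commands(live_dir: str = "dvclive") -> list[list[str]]:
    """Commands to move a Git-tracked directory to DVC tracking.

    Hypothetical helper; assumes live_dir is at the repository root.
    """
    return [
        # Drop the directory from the Git index but keep the files on disk.
        ["git", "rm", "-r", "--cached", live_dir],
        # Let DVC cache the whole directory.
        ["dvc", "add", live_dir],
        # Stage the pointer file and ignore rule that `dvc add` produces.
        ["git", "add", f"{live_dir}.dvc", ".gitignore"],
    ]


if __name__ == "__main__":
    for cmd in migration_commands():
        print(shlex.join(cmd))
```

Starting with the directory cached by default would make the first command unnecessary.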