Cache dvclive directory
Background from https://github.com/iterative/example-repos-dev/issues/249:
do `-o eval` for the entire dir
I increasingly feel this would be the best default for dvclive, both with and without pipelines. It requires a DVC remote, but I think it's needed in almost every real scenario, and Git-only workflows are likely not helping with onboarding since I have found most people are confused by the idea of tracking dvclive with Git. Caching all of dvclive simplifies the workflow. Without pipelines, it's easy to log any size files and know that everything logged by dvclive is saved in the same place. With pipelines, it makes the stage definition simple since you just add dvclive as a stage output.
Originally posted by @dberenbaum in https://github.com/iterative/example-repos-dev/issues/249#issuecomment-1707106231
Following #687 and the dvclive 3.0 release, it will be possible to `dvc add` the entire dvclive directory. Steps in this direction can be:
1. Make the example repos track the entire directory with DVC.
2. Document this as the recommended pattern.
3. Have an option to do this automatically in dvclive.
4. Make it the default behavior in dvclive.
Let's start with 1 and see whether it feels like an improvement, in which case we can move on to the rest of the list. If we find it becomes the recommended pattern, we can make it a default in dvclive 4.0 (similar to the progression of save_dvc_exp).
@shcheklein Let's collect here the issues that we are seeing with this approach so it's easier to evaluate. So far I see:
TBH it's already enough to at least give me pause. Ideally, I would probably cache plots and artifacts and leave metrics.json and params.yaml to Git, but this feels complicated.
Lack of caching
It should be solved by @skshetry already, AFAIU?
Remote access
I think I agree with you that it's still worth it, though. It makes the management of the directory easier.
Badges
Seems minor? (I wish we had them; maybe just make a copy of the metrics file in some way.)
but this feels complicated.
What do you see here as the main complication? People annoyed by the presence of the eval or dvclive dir? Something else?
My main concerns are:
- Having to explain that different outputs get saved to different places and explaining how to configure this. One of the main benefits of this approach was keeping everything in one place and simplifying the pipeline to a single dvclive output.
- Remote access. Something like `dvc exp show` or the VS Code table will fail to show old commits if remote access isn't available, for example.
Feedback from https://iterativeai.slack.com/archives/C01SR9Q12LB/p1695337625637059:
is logging nicely in studio with dvclive and I am able to monitor the training curves however after running `dvc exp push` - it's due to missing remote - we delete live metrics but metrics from remote are not available yet. Not a smooth experience.
A point in favor of caching the dvclive directory by default: transitioning to caching it later (for example, as a pipeline output) requires running git rm --cached ... to drop it from Git first, introducing potential friction.
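
To make the friction concrete, here is a rough sketch of that transition in a scratch repository. The directory name `dvclive`, the sample `metrics.json` content, and the commit messages are assumptions for illustration, not taken from any real project. The DVC step only runs if `dvc` is installed.

```shell
#!/bin/sh
set -eu

# Scratch repository standing in for a project that committed dvclive to Git.
# (Directory name and file contents are illustrative assumptions.)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
mkdir dvclive
echo '{"loss": 0.1}' > dvclive/metrics.json
git add dvclive
git commit -q -m "Track dvclive with Git"

# Drop the directory from the Git index only; the files stay on disk.
git rm -r -q --cached dvclive
git commit -q -m "Stop tracking dvclive with Git"

# Hand the directory over to DVC, if DVC is available.
if command -v dvc >/dev/null 2>&1; then
    dvc init -q
    dvc add -q dvclive                 # writes dvclive.dvc and updates .gitignore
    git add dvclive.dvc .gitignore
    git commit -q -m "Cache dvclive with DVC"
fi
```

The extra `git rm --cached` commit is the step that disappears if dvclive is cached from the start.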