`dvc studio login`: set up auto-pushing experiments
See ~#5029~ (edit: https://github.com/iterative/dvc.org/issues/5029) and the related issues linked there for background.
Rather than document the environment variables to auto push experiments, we could make this part of the studio login workflow since auto-pushing experiments is mostly useful when using studio rather than keeping experiments local. We would need to:
- Make config options like `exp.auto_push` and `exp.git_remote`
- During studio login, ask to set these options. The UI could look something like this:
$ dvc studio login
...
Authentication successful. The token will be available as risen-geum in Studio profile.
Do you want to push experiments automatically when they are completed [Y\n]?
Enter the Git remote to use [origin]:
Since this is a studio login, I'd rather not have any prompts and enable everything by default. But we should notify users that they are enabled and provide hints to disable (and support arg to disable this behaviour).
cc @iterative/vs-code since this should also impact the vs code flow
@skshetry Do you mean to enable them during studio login or some other time? Auto-pushing is pretty connected to the Studio workflow since it's where the pushed experiments appear, and I don't think it's worthwhile to auto-push them without Studio.
Regardless, I think we can do the first step of adding config options. Having to set environment variables every time to auto push doesn't make much sense.
Do you mean to enable them during studio login or some other time?
Enabling them automatically during studio login (unless they are already disabled by other means).
Not that strong an opinion, but gh auth login has prompts. While they can be clumsy, in this case there is already some interaction needed, so I didn't think prompts would be bad UX. What's your concern?
From a new user perspective, it might be confusing and unclear what to choose. "Do you want to push experiments?" - maybe, maybe not, idk. What's experiments? etc.
It would definitely lead to choice paralysis for me if I were using it for the first time. 😅
It's better to make a choice for them here, but the message should make clear that we are doing that. We want as few interactions, and as few decisions for the user to make, as possible.
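To make the no-prompt proposal concrete, here is a minimal Python sketch of what a post-login step could do. The function name `enable_auto_push_on_login`, the `opt_out` argument, and the dict-shaped config are assumptions for illustration, not DVC's actual API; only the `exp.auto_push` and `exp.git_remote` option names come from this thread.

```python
def enable_auto_push_on_login(config, opt_out=False):
    """Hypothetical post-login step: enable auto push without prompting.

    `config` is a plain nested dict standing in for the DVC config.
    Returns the messages that would be shown to the user.
    """
    if opt_out or "auto_push" in config.get("exp", {}):
        # Respect an explicit opt-out or a value the user has already set.
        return []

    exp = config.setdefault("exp", {})
    exp["auto_push"] = True
    # Assumption: fall back to "origin" when no Git remote is configured.
    exp.setdefault("git_remote", "origin")

    return [
        f"Experiments will be pushed automatically to Git remote '{exp['git_remote']}'.",
        "To disable: dvc config exp.auto_push false",
    ]


config = {}
for line in enable_auto_push_on_login(config):
    print(line)
```

The point of the sketch is that the user makes no decision, but still sees what was enabled and how to turn it off.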
We also need a way to auto push on `dvc exp save` for dvclive-only experiments. `DVC_EXP_AUTO_PUSH` does not do this now.
Thoughts on this approach?
- Once you log in to studio, everything will be pushed automatically unless you set it to offline, and we can make clear during login how to toggle offline mode
- We can show a notification before starting the push making clear that if you don't want to wait, it's safe to cancel and you can always upload later with `dvc exp push`
Not a requirement but nice to have would be to incorporate #8843 when doing this. If we can push the dvc-tracked data at the end of each stage, and include the run cache, it can help in scenarios like recovery from failed runners but also break up the pushes during the experiment run so the final push may not feel so painful.
Tasks for this issue:
- [x] Confirm `DVC_EXP_AUTO_PUSH` works as expected
- [x] Make `DVC_EXP_AUTO_PUSH` default to use git remote `origin` (currently requires `DVC_EXP_GIT_REMOTE`)
- [x] Make `DVC_EXP_AUTO_PUSH` work on `dvc exp save`
- [x] Add config options for `dvc config exp.auto_push` and `dvc config exp.git_remote`
- [x] Handle errors if no dvc or git remote
- [x] Enable during `dvc studio login` with instructions or option to opt out
- [x] During push, show useful messages in case it's slow (it's safe to cancel, how to upload later, how to disable push)
- [x] Handle case where remote doesn't exist
- [ ] Simplify ways to set git remote url
- [x] Make auto push work with queue
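The checklist items on defaults and config options imply some lookup order for the two settings. A rough Python sketch follows, assuming environment variables take precedence over config and that the usual truthy strings enable the flag; neither assumption is confirmed in this thread, and `resolve_auto_push` is a hypothetical name rather than a DVC function.

```python
import os

# Assumption: which strings count as "enabled" for DVC_EXP_AUTO_PUSH.
TRUTHY = {"1", "true", "yes", "y", "on"}


def resolve_auto_push(config, env=None):
    """Return (enabled, git_remote) from env vars, then config, then defaults."""
    env = os.environ if env is None else env
    exp = config.get("exp", {})

    if "DVC_EXP_AUTO_PUSH" in env:
        enabled = env["DVC_EXP_AUTO_PUSH"].strip().lower() in TRUTHY
    else:
        enabled = bool(exp.get("auto_push", False))

    # Default to "origin" so DVC_EXP_GIT_REMOTE is no longer required.
    remote = env.get("DVC_EXP_GIT_REMOTE") or exp.get("git_remote") or "origin"
    return enabled, remote


print(resolve_auto_push({"exp": {"auto_push": True}}, env={}))
```

With only `exp.auto_push` set, the remote falls back to `origin`, which matches the second checklist item.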
Out of scope:
- [ ] #8843
@skshetry I updated the checklist above for what's left to do here.
@dberenbaum, any thoughts on how to simplify?
We could also make `studio.repo_url` an alias for `exp.git_remote` and deprecate it, so you can specify either a URL or a git remote name.
Originally posted by @skshetry in https://github.com/iterative/dvc.org/pull/5165#discussion_r1512677465
@skshetry This suggestion makes sense to me.
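For illustration, the alias idea could look roughly like the Python sketch below. It assumes a plain nested dict for the config and a hypothetical helper name `get_exp_git_remote`; the precedence (new option first, deprecated alias second, `origin` last) is my reading of the suggestion, not something settled here.

```python
import warnings


def get_exp_git_remote(config):
    """Hypothetical lookup: prefer exp.git_remote, fall back to studio.repo_url."""
    exp = config.get("exp", {})
    studio = config.get("studio", {})

    if exp.get("git_remote"):
        return exp["git_remote"]

    if studio.get("repo_url"):
        # The old option keeps working but nudges users toward the new one.
        warnings.warn(
            "studio.repo_url is deprecated; use exp.git_remote instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return studio["repo_url"]

    # Assumption: default to the "origin" remote when nothing is configured.
    return "origin"


print(get_exp_git_remote({"studio": {"repo_url": "https://example.com/org/repo.git"}}))
```

Either value kind passes through unchanged, so the caller can accept a URL or a remote name from the same option.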
Added "Make auto push work with queue" to the checklist. Currently, queued experiments fail because `origin` is not set in the queued repo:
$ dvc exp run --run-all
Following logs for all queued experiments. Use Ctrl+C to stop following logs (experiment execution will continue).
Reproducing experiment 'sober-daze'
Running stage 'train':
> python src/stages/train.py --config=params.yaml
WARNING: Failed to validate remotes. Disabling auto push: 'origin' is not a valid Git remote or URL
Ran experiment(s):
To apply the results of an experiment to your workspace run:
dvc exp apply <exp>
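The warning above suggests the value is accepted when it is either a known Git remote name or a URL. A small Python sketch of that kind of check shows why `origin` fails in a repo with no remotes while a URL would still pass. The function name `is_valid_remote` and the URL heuristic are assumptions for illustration, not DVC's actual validation.

```python
import re

# Assumption: treat "scheme://..." and scp-like "user@host:path" values as URLs.
_URL_RE = re.compile(r"^(?:[a-z][a-z0-9+.-]*://|[\w.-]+@[\w.-]+:)", re.IGNORECASE)


def is_valid_remote(value, known_remotes):
    """Accept a configured Git remote name or anything that looks like a URL."""
    return value in known_remotes or bool(_URL_RE.match(value))


# In the workspace, "origin" is a configured remote.
print(is_valid_remote("origin", {"origin"}))
# In a queued repo with no remotes configured, the same name is rejected...
print(is_valid_remote("origin", set()))
# ...while a URL does not depend on the repo's remotes.
print(is_valid_remote("https://example.com/org/repo.git", set()))
```

Under these assumptions, one possible direction is to resolve the remote name to its URL before the experiment is queued, though that is only a guess at a fix.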