dvc
dvc copied to clipboard
Common UI improvements
Goals for UI:
- Abstract UI implementation to apply high-level functions/classes to all UI components
- [x] #5347 and #5348
-
Establish CLI/UI guidelines
-
Improve formatting, style, and functionality
@skshetry @efiop @pmrowla
Referenced in https://github.com/iterative/dvc/pull/5282#discussion_r570719275: Subcommand should usually be plural
What about extracting all user messages to a central location for easy access to the prod team to help with? Could even help for future i18n (more context).
See https://github.com/iterative/dvc/discussions/5891 for --quiet
guidelines
@skshetry Are there other issues that can be linked here?
Some updates on this - We decided to adopt https://clig.dev as a base guideline for the CLI applications (I would suggest everyone read this). I started going through our commands and fix where we violate this and make some improvements along the way.
I had to start somewhere, so I have chosen data management commands - starting with add
. I noticed some performance and reliability issues on add
: #6227, #6226, and #6228, which I'll be going to fix, but before I need to do some cleanups.
After that, I'll focus on improving data management commands (add/commit/remove/move/checkout/destroy
).
If something bugs you or is odd, please open an issue and assign me. I'll try to fix them as possible.
Another thing that I'd like to open the discussion is the UI messages for the add
.
- The
add
command is quiet about what is done, at the end, there's only a progress bar (and, an ugly hint). What should it say on success? - Do we even need a progress bar for a single target to add (eg:
dvc add data
)? - Do we need progress bars at all, even for multiple targets?
- How should DVC show progress bars/status on places where it might take a lot of time? eg: recursively adding a large directory, collecting lots of stages, etc?
To compare (and, spark the discussion, here are the asciinema recordings for multiple targets add and single targets add).
Single target
Multiple targets
Another guide for reference: https://primer.style/cli/
We decided to adopt https://clig.dev as a base guideline for the CLI applications
p.s. would it be worth giving a quick talk to others on this? E.g. in an all hands meeting or in a bash perhaps (cc @casperdcl)
The add command is quiet about what is done, at the end, there's only a progress bar (and, an ugly hint).
Not sure why you consider the progress bar to be quiet @skshetry . I think it communicates what's happening clearly enough (generally speaking).
💡 The exact phrases used in all messages can always be improved and to that effect I propose to consider an i18n-like effort to refactor the code extracting message strings to a separate module or config file anyone can easily propose updates on.
Idk what to think about the Git hints. I give them for granted at this point but thinking about it, I'm not sure. Are they useful realistically? Or should people know what to do after dvc add
? Maye changing that to a more explicit message on what happened would be better e.g. 'data.csv' was cached. './data.csv.dvc' was created.
so the user can decide what to do with the results.
Do we need progress bars at all
Yes. These operations can take a long time. It's important to know that there's progress.
How should DVC show progress bars/status on places where it might take a lot of time?
I see some problems with the current way (shown in the 2nd animation above):
- The main progress bar starts at
0/2
. Why 0? And it never prints the namedata
(first dir given) - "computing hashes" states it's
only done once
but it seems to happen more than once (once per target I guess?). - The 2nd top bar jumps from 0 to
2/2
and prints the namedata2
(2nd dir given).
In all it is a bit confusing to try interpret those bars. Can there be a single, persistent master bar on top and then smaller bars for each subprocess underneath?
I give them for granted at this point but thinking about it, I'm not sure. Are they useful realistically? Or should people know what to do after
dvc add
?
They are misleading. I think silence is better than providing an incorrect command. (Some of these files are .gitignore
d, see the difference between proposed git add
and git status
)
I believe building the command verbose doesn't add much value anyway. For the first few times, they may lead the user but after that, they are ignored. Verbosity must be optional, or at least could be turned off using config.
Even notoriously difficult tools like ffmpeg
don't lecture about the usage as much as DVC. I don't feel it's the right attitude towards the users, when we tell "you should do git add
after this command", we are signaling "you don't know how to use this tool and we don't trust you." This may be true for some new users, but for veterans these messages are a nuisance I believe.
Or should people know what to do after
dvc add
?
I am hesitant to remove this on dvc add
as I have seen a lot of users just copy and paste that. I don't expect everyone using DVC to understand git. They might not get very far, but for "Data Versioning" in DVC, it's enough. And, git add
hint on dvc add
is very stable. I do have plans for suppressing those hints messages.
For experiments and pipelines, I don't have strong opinions.
Update on UI improvement:
- Index was introduced in #6300.
- In
dvc add
, some progress bars were merged and reduced. A few spinners during stage collection and graph checks were added. We still need to fix how we save files to cache which we decided on https://github.com/iterative/dvc/discussions/6252. Also see https://github.com/iterative/dvc/issues/6387. -
dvc add
duplicated targets are now ignored, instead of failing with ambiguous error. - Updater was changed to only show on
dvc version
, thanks to @efiop. - Updater message was changed from in-box to
pip
like output. - Introduced DVC_ANALYTICS env var thanks to @efiop.
- fsspec callback was introduced by @isidentical and
push
/pull
transfers were unified by @pmrowla, making it possible for us to have a single progress bar: https://github.com/iterative/dvc/issues/3682.
The #3682 is now being worked on. 🙂
@skshetry Do you think we can close this and open any more specific UI issues, or is it helpful to keep this open?
Closing as stale. Let's open more specific follow-ups as needed.