dvc.org
Introduce git helpers?
For #2856, should we introduce some of the existing options to reduce friction with Git? In particular:
- `core.autostage`: staging changes to DVC files.
- `install`: install hooks for checkout, commit, and push.
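For reference, a minimal sketch of what the first option looks like once enabled, assuming it was set with `dvc config core.autostage true` in a Git-tracked DVC project. The file path and layout below are the usual project-level config, but the exact whitespace DVC writes may differ:

```ini
# .dvc/config (project-level DVC config; illustrative fragment only)
[core]
    # Assumption: with this enabled, DVC stages the DVC files it changes
    # with Git, so a separate `git add` for them is not needed
    autostage = true
```

`dvc install` has no config entry of this kind; it sets up Git hooks in the repository instead.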
These options might be too in-depth for getting started, but they also seem hard to find today despite being useful for many (maybe most) users to reduce the pain of duplicating DVC and Git commands.
I can confirm from my own experience that `core.autostage` is very useful and not that easy to discover.
That's a good observation. We can have a single-sentence ℹ️ quotation in the data management trail.
If there's a way to link to these and other similar remarks (e.g. https://github.com/iterative/dvc.org/issues/2970) from a single place like a note or hidden section then OK.
BTW I personally never use those Git helpers: a) I have `git` aliases and they're muscle memory at this point; b) I like manually keeping control over commits (sometimes rebase, etc.).
How do we determine if most users will probably want to know this from the get-go?
It's about lowering the barrier to entry. It doesn't have to be in this get started guide, but it should be mentioned somewhere outside the cmd ref. I think this is reasonable even if it's not wanted by most users.
If we somehow determine most users want this from the get-go, then the better solution might be to make them default behavior in the core repo (at least for `autostage`), which has been proposed in the past.
I see. So you are suggesting to use and assume Git helpers throughout this trail instead of repeating `git` commands?
Not to get too philosophical, but do we consider Git a barrier to entry into DVC? Goes back to https://github.com/iterative/dvc/discussions/5896
> I see. So you are suggesting to use and assume Git helpers throughout this trail instead of repeating `git` commands?
Maybe in the future. Let's not go there yet.
Do you agree that we should at least have some way to point users to these helpers rather than having them hidden in the command reference? I don't know how users can be expected to find out about them today.
I think #72 can cover things like this? Especially if we mention that section at the very beginning (put a link in the User Guide tile on the index page, for example)?
> I think #72 can cover things like this? Especially if we mention that section at the very beginning (put a link in the User Guide tile on the index page, for example)?
Sure, that could be an option.
I updated the title of the issue to remove "start" so that it's left open where this goes.
> Do you agree that we should at least have some way to point users to these helpers
It's a great question but we have the same problem with many small features, probably. You see it in support channels all the time when we suggest these features and users go "ahhhh I didn't know that existed"...
p.s. sounds like Best Practices is def. a good candidate for 2021Q1 roadmap.
> I updated the title of the issue to remove "start" so that it's left open where this goes
Side issue: it's still labeled `content/start` though. We don't have a label for ambiguous/multiple/unknown content needed cc @iesahin
> Side issue: it's still labeled `content/start` though. We don't have a label for ambiguous/multiple/unknown content needed cc @iesahin
Should we? :)
It feels like these tidbits of information could go into a "Remarks, Recommendations, Side Notes", etc. section in a rather "silent" (or hidden) box at the end of GS or UG documents.
Our current "hidden" boxes are a bit glaring; they are more emphasized than the section titles. I think they could be a dull grey, or we could have a separate kind of hidden section for this minor info.
A few more thoughts on this:
- Git helpers are different than features that are specific to narrow use cases because the Git helpers are relevant (whether people choose to use them or not) to any use case except `--no-scm` ones.
- If we have similar problems with other features, let's figure out what to do with this one and we can consider a similar approach for those?
- On second thought, I'm not sure if #72 is a great option. None of the other best practices are solely pointing out existing DVC commands, and it doesn't seem like we necessarily think these Git helpers are a best practice.
What do you think about putting it into the "How To" section as a new subpage titled something like "Reduce DVC/Git Duplication"?
> If we have similar problems with other features, let's figure out what to do with this one
My point was mainly that we can't include everything in the Get Started. It would be nice if there were mentions to everything related to the features we do cover but it would end up being noise. Anyway, sounds like we agree it can be elsewhere.
We currently have this info in the `install` cmd ref. and linked from the other related references (which are linked extensively from GS and guides, so that's one indirect route for users to find it). Except from `add`: that ref's examples may need more `git` commands BTW (and possibly a link to `install`).
Between Best Practices and How-Tos I lean toward the former (the latter should address specific questions/problems users tend to have), but really maybe we need a whole new User Guide section for data/model management (including versioning)... Again, currently all those explanations are in the cmd ref (`add`, `commit`, `checkout`, `push/pull`). rel https://github.com/iterative/dvc.org/issues/144
We could use some hover notes that are shown across the documentation. We can create them like `<hover>git-helpers</hover>`, and these boxes describe these optional-but-not-necessarily-recommended features.
Otherwise, IMHO, linking to another page distracts the user, telling them in the main text breaks the narrative, and putting them in a hidden section makes these unnecessarily valuable :) These can be thought of as "common footnotes across the site".
I think we're overcomplicating this.
The easiest first step would be to find a place to link to `install` from the current https://dvc.org/doc/start/data-and-model-versioning and submit a PR proposal. And to mention `core.autostage` in https://dvc.org/doc/command-reference/install + link to the `config` ref.
To summarize the larger question, is it whether to keep featuring a bunch of `git add/commit` samples in our docs (especially Get Started/Data Management) or to introduce/assume `dvc install` more?
Another place where `hover` functionality may prove useful is issues like #2970. We can put little boxes on options like `--local` that say "this is optional, we use it for such and such."
Related: https://github.com/iterative/dvc/discussions/6929.
> And to summarize the larger question, is it whether to keep featuring a bunch of `git add/commit` samples in our docs (especially Get Started/Data Management) or to introduce/assume `dvc install` more?
That wasn't initially my question, but it's a good question. First, it's probably more applicable to `autostage` than `dvc install`. `dvc install` reduces DVC commands, not Git commands.
Even for `autostage`, I wouldn't rush to do this. It would be a big change and one that might not make sense for users who miss that part or who started using DVC before we made the change. Also, glancing through the examples, I think the Git commands are mostly helpful by showing when `.dvc` or `.gitignore` files were modified.
When I mentioned lowering the barrier to entry, it could be by eventually eliminating these Git commands and not bothering to explain `.dvc` files and the details of how DVC works at first, but I'm not so sure I even like that idea, and I don't think we should go that far now.