git-novice
This description of the VCS concept is misleading for Git
This description of the version control concept is misleading for Git. Since this is specifically in the git-novice lesson, I think it should be changed.
Unlike Subversion, Git stores an entire copy of each (staged) file that was changed when you commit. It does not recreate versions by applying sequences of diffs. This is explained here.
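As a rough illustration of the point above (a sketch, not Git's actual implementation), the ID Git assigns to a stored file is computed from the file's full contents, not from a diff against an earlier version. The helper name `git_blob_id` is made up for this example, and it assumes a repository using the default SHA-1 object format.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would give a file with these contents.

    Hypothetical helper for illustration; assumes the SHA-1 object format.
    """
    # Git hashes a small header ("blob <size>\0") followed by the whole file.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


version_1 = b"hello world\n"
version_2 = b"hello world\nand a second line\n"

# Each version is hashed as a complete file, so even a one-line edit
# yields a different ID for a brand-new object.
id_1 = git_blob_id(version_1)
id_2 = git_blob_id(version_2)
print(id_1)
print(id_2)
```

The first ID should match what `git hash-object` reports for a file containing `hello world` followed by a newline, which is one way to check the sketch against a real Git installation.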
There is some related discussion here.
The point is that this episode is more an introduction to version control systems than to Git. (I would be OK with changing the second objective to reflect that.) The way Git stores things internally is pretty fascinating, but at this level it's probably an implementation detail. We recommend that instructors read about and understand how Git works (see the instructor's guide), but for learners it might be confusing.
I think that trying to explain what is really happening would either complicate things too much for some learners, or would take much more time.
I'm closing this for now, but feel free to reopen if you can think of a modification that is more technically accurate without adding much time or extra complication to the lesson.
I'm also opening a new issue with the change of the second objective.
Thanks for considering this.
I do think this issue should still be open; but I don't see that I have an option to re-open it. How do I re-open the issue?
I agree entirely about keeping things simple; and, as you suggest, I see no need to teach the graph model discussed in #263. The good news is that we can teach a more accurate mental model and, at least to me, it's even simpler than the diff mental model we're teaching. Maybe if you've been using Subversion for years, then it's hard to let go of this idea of applying a sequence of diffs to recover different versions. But being naive, it's much easier to picture storing copies of any files that are changed. This is actually closer to what we do manually without version control systems.
Also for reference, in this later episode, it would help to already have a correct mental model for how Git stores changes, especially when we say "When we run git commit, Git takes everything we have told it to save by using git add and stores a copy permanently inside the special .git directory." It was actually when I was reviewing this lesson for the first time that I looked up how Git stores changes, since I had been assuming the Subversion diff concept.
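To make the "store a copy" model concrete, here is a toy sketch in Python. It is deliberately much simpler than Git: the class `ToyRepo`, its methods, and the file names are all invented for illustration. It only shows the idea that a commit records complete copies of files and that an unchanged file is not stored twice.

```python
import hashlib


class ToyRepo:
    """Toy snapshot store; hypothetical, far simpler than Git itself."""

    def __init__(self):
        self.objects = {}   # content hash -> full file contents
        self.staged = {}    # file name -> content hash recorded by add()
        self.commits = []   # each commit is a full file name -> hash mapping

    def add(self, name: str, content: bytes) -> None:
        # Store a complete copy of the file, keyed by a hash of its contents.
        key = hashlib.sha1(content).hexdigest()
        self.objects[key] = content
        self.staged[name] = key

    def commit(self) -> int:
        # A commit records which stored copy of every tracked file it uses.
        self.commits.append(dict(self.staged))
        return len(self.commits) - 1

    def checkout(self, commit_id: int) -> dict:
        # Recovering a version is a lookup; no sequence of diffs is replayed.
        snapshot = self.commits[commit_id]
        return {name: self.objects[key] for name, key in snapshot.items()}


repo = ToyRepo()
repo.add("report.txt", b"Cold and dry\n")
repo.add("notes.txt", b"Draft\n")
first = repo.commit()

repo.add("report.txt", b"Cold and dry\nTwo moons\n")  # only this file changed
second = repo.commit()

print(repo.checkout(first)["report.txt"])
print(len(repo.objects))  # 3 stored copies: notes.txt is stored only once
```

Both commits point at the same stored copy of `notes.txt`, while `report.txt` has two complete copies, one per version.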
Thoughts?
Ok, we can reopen and let others comment.
Most (if not all) of our learners are completely new to version control systems, so rather than thinking in terms of Subversion, they are more likely to have in mind models like Word autosave or renaming files with version names, as shown in the comic in the first episode.
We define a commit as a set of changes, and a changeset is a common concept in version control. This makes it easy to think in terms of diffs. Thinking in terms of diffs is also compatible with the interface, e.g. when we use diff, talk about three-way merges, or do more advanced things like git revert. I can't think of a case where, as a (novice) user, you need to know how things are actually stored, and that's why I said this is an implementation detail.
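For what it's worth, the two views are not in conflict: a diff can be computed on demand from two complete snapshots. The sketch below uses Python's standard `difflib` module rather than Git's own diff machinery, and the file name and contents are made up for the example.

```python
import difflib

# Two complete snapshots of the same (made-up) file.
old = "Cold and dry\n".splitlines(keepends=True)
new = "Cold and dry\nTwo moons\n".splitlines(keepends=True)

# The diff is derived from the two snapshots only when it is asked for.
diff = list(
    difflib.unified_diff(old, new, fromfile="a/notes.txt", tofile="b/notes.txt")
)
print("".join(diff))
```

So a learner can keep thinking of a commit as "what changed" at the interface level, even if what is stored underneath is full copies.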
You made a good point in saying that storing copies of the files is closer to what one would do manually. I just don't see a good way of including it at this stage of the lesson.
This is something we could try:
- leave this first chapter as it is, but include #523 so it's broader and more accurate,
- when talking about changes (episode 4), include a callout box or a new short paragraph explaining the more accurate mental model, noting that it is mostly an implementation detail, but worth knowing.
We could reference the Git parable, which gives a more accurate picture using non-technical language.
I changed the label to discussion to gather other opinions before deciding.
This issue and #523 are connected, so apologies if I'm cross-commenting:(
I think this lesson doesn't need to use the "change" model; it's not necessary at this level. To introduce the basic functions of a version control system, the model used in the Git parable is a good, tangible model that's most relevant to Git learners. They're typically people who are working on some type of source code, written text, etc., which naturally lends itself to a "snapshot" model like the parable's.
The "snapshot" type of model also makes clear the differences between a VCS and a word processor (track changes, e.g., or perhaps auto-save functionality).
We should also add a strong suggestion for learners who back up their machines: make sure that your repo folder (and subfolders) and/or working dir are backed up. Kind of off-topic, but important IMHO :)
Did merged PR #715 actually solve this issue? If so, could the issue be closed?