git.base.IndexFile.iter_blobs crashes after executing `git update-index --skip-worktree <some_file>` - unsupported git index version
Hi.
I've encountered an AssertionError when calling git.base.IndexFile.iter_blobs after running git update-index --skip-worktree <some_file>, as shown below.
%pip install GitPython==3.1.11
Requirement already satisfied: GitPython==3.1.11 in /usr/local/python3.7/lib/python3.7/site-packages (3.1.11)
Requirement already satisfied: gitdb<5,>=4.0.1 in /usr/local/python3.7/lib/python3.7/site-packages (from GitPython==3.1.11) (4.0.5)
Requirement already satisfied: smmap<4,>=3.0.1 in /usr/local/python3.7/lib/python3.7/site-packages (from gitdb<5,>=4.0.1->GitPython==3.1.11) (3.0.4)
%cd $(mktemp -d)
%git init .
Initialized empty Git repository in /tmp/tmp.JXLEk4zfdj/.git/
%touch foo
%git add foo
%git update-index --skip-worktree foo
%python3 -c '
import git
with git.Repo(".", search_parent_directories=True) as repo:
    print(*repo.index.iter_blobs())
'
Traceback (most recent call last):
  File "<string>", line 4, in <module>
  File "/usr/local/python3.7/lib/python3.7/site-packages/git/index/base.py", line 441, in iter_blobs
    for entry in self.entries.values():
  File "/usr/local/python3.7/lib/python3.7/site-packages/gitdb/util.py", line 253, in __getattr__
    self._set_cache_(attr)
  File "/usr/local/python3.7/lib/python3.7/site-packages/git/index/base.py", line 128, in _set_cache_
    self._deserialize(stream)
  File "/usr/local/python3.7/lib/python3.7/site-packages/git/index/base.py", line 157, in _deserialize
    self.version, self.entries, self._extension_data, _conten_sha = read_cache(stream)
  File "/usr/local/python3.7/lib/python3.7/site-packages/git/index/fun.py", line 185, in read_cache
    version, num_entries = read_header(stream)
  File "/usr/local/python3.7/lib/python3.7/site-packages/git/index/fun.py", line 165, in read_header
    assert version in (1, 2)
AssertionError
It's not an urgent issue for me, because everything works again after git update-index --no-skip-worktree, but if you have spare time, could you please add support for git indexes that use skip-worktree?
FYI, if I run Python with the -O flag, the example above seems to work correctly.
It seems like --skip-worktree upgrades the index file to a more recent version which is not yet supported by GitPython, hence the assertion error.
Luckily, disabling this feature with --no-skip-worktree downgrades the index version again, which is quite a mindful thing for git to do.
Running Python with the -O flag strips assert statements entirely, as they are considered a debugging feature, which is why the example appears to work. Altering a git index that was written in a more recent format could cause all kinds of trouble, so implementing proper support would certainly be preferred.
I am leaving this issue open as it might be worth investigating if GitPython does indeed silently corrupt index files written with a newer version, or whether it somehow manages to not discard any data that it previously decoded. If so, the assertion could possibly be relaxed.
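For illustration, here is a minimal sketch of a header check that does not disappear under -O, because it raises an explicit exception instead of using assert. It relies only on the documented index header layout (4-byte signature DIRC, then a 4-byte version and a 4-byte entry count, both big-endian). The names read_index_header and UnsupportedIndexVersion are hypothetical and are not part of GitPython.

```python
import io
import struct


class UnsupportedIndexVersion(Exception):
    """Hypothetical exception type for this sketch, not a GitPython class."""


def read_index_header(stream, supported=(1, 2)):
    """Return (version, num_entries) read from the start of a git index stream.

    Unlike an assert, these checks still run when Python is started with -O.
    """
    data = stream.read(12)
    if len(data) != 12:
        raise ValueError("truncated index header")
    # Header layout: 4-byte signature, 32-bit version, 32-bit entry count,
    # all in network (big-endian) byte order.
    signature, version, num_entries = struct.unpack(">4sII", data)
    if signature != b"DIRC":
        raise ValueError("not a git index file: %r" % signature)
    if version not in supported:
        raise UnsupportedIndexVersion("unsupported git index version: %d" % version)
    return version, num_entries


# A synthetic version 3 header with one entry, similar to what git writes
# once an entry carries the skip-worktree flag.
header_v3 = struct.pack(">4sII", b"DIRC", 3, 1)
try:
    read_index_header(io.BytesIO(header_v3))
except UnsupportedIndexVersion as exc:
    print(exc)
```

With a check like this, the failure carries a readable message and cannot be silenced by an interpreter flag, which seems safer than relaxing the assertion before the write path has been investigated.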
I just hit this issue while updating vim plugins for nixpkgs. The repo seems to be fine after the assertion failure, and I would really like to get a patch for this sooner rather than later.
This can also be triggered via git add --intent-to-add <path> (aka git add -N <path>). I'm surprised that doesn't burn more folks, as git add -N is quite handy.
I just ran into this while working on the Fedora git package. The package maintainer tooling (rpkg/fedpkg) uses GitPython.
Yes, indeed, GitPython cannot read V3 index files, which add support for extended attributes. The 'intent-to-add' flag is the second of the bunch.
Adding support for it is definitely doable as reading such extended indices involves checking for a bit to trigger reading another two bytes, which are then added to the decoded flags.
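The decoding step described above could look roughly like the sketch below. The bit masks follow git's index format documentation (in the regular 16-bit flags: assume-valid 0x8000, extended 0x4000, stage 0x3000, name length 0x0FFF; in the extra 16 bits: skip-worktree 0x4000, intent-to-add 0x2000). The function and class names are hypothetical and do not reflect how GitPython structures its reader.

```python
import io
import struct
from typing import NamedTuple

# Bits of the regular 16-bit flags field of an index entry.
CE_NAMEMASK = 0x0FFF
CE_STAGEMASK = 0x3000
CE_EXTENDED = 0x4000
CE_VALID = 0x8000

# Bits of the additional 16-bit field that is present only when
# CE_EXTENDED is set (index version 3 or later).
CE_INTENT_TO_ADD = 0x2000
CE_SKIP_WORKTREE = 0x4000


class EntryFlags(NamedTuple):
    """Hypothetical container for decoded flags, used only in this sketch."""
    name_length: int
    stage: int
    assume_valid: bool
    skip_worktree: bool
    intent_to_add: bool


def read_entry_flags(stream, version):
    """Read the flags of one index entry, including the extended ones."""
    (flags,) = struct.unpack(">H", stream.read(2))
    extended = 0
    if flags & CE_EXTENDED:
        if version < 3:
            # The extended bit must be zero in a version 2 index.
            raise ValueError("extended flag set in a version %d index" % version)
        # The extended bit triggers reading another two bytes.
        (extended,) = struct.unpack(">H", stream.read(2))
    return EntryFlags(
        name_length=flags & CE_NAMEMASK,
        stage=(flags & CE_STAGEMASK) >> 12,
        assume_valid=bool(flags & CE_VALID),
        skip_worktree=bool(extended & CE_SKIP_WORKTREE),
        intent_to_add=bool(extended & CE_INTENT_TO_ADD),
    )


# Synthetic flags for a 3-byte path ("foo") with skip-worktree set.
raw = struct.pack(">HH", CE_EXTENDED | 3, CE_SKIP_WORKTREE)
print(read_entry_flags(io.BytesIO(raw), version=3))
```

A writer would need the mirror image of this: set the extended bit and emit the extra two bytes whenever either extended flag is set, and bump the header version to 3 in that case.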
I am using git 2.41.0. --no-skip-worktree did not work for me.
This did though:
git update-index --index-version=2
Somewhat related, https://github.com/git/git/commit/9213563f0a adds a --show-index-version option to update-index and adjusts the documentation to suggest that index version 4 "should be considered mature technology these days." That commit is only in the `seen` branch at the moment, so it will likely appear in git-2.43.0, scheduled for a late November release.
That may result in more tools using features from and/or expecting support for index version 4 in the not-too-distant future.
I would love to be able to submit patches to support index version 3 and 4, but I'm unlikely to muster the amount of time and/or skill needed to do that anytime soon.
I was about to suggest looking at git-dulwich for answers, but they don't seem to support V4 either. With that said, gitoxide has an implementation for all versions and all extensions, and could serve as an example. Once read, V4 is exactly the same in memory; it only affects how paths are decoded, as these are now deltified (prefix-compressed against the previous entry's path).
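As a rough illustration of the V4 path decoding, here is a sketch based on git's index format documentation: each path is stored as a variable-width integer N (same encoding as OFS_DELTA offsets in pack files) followed by a NUL-terminated suffix, and the full path is the previous entry's path with its last N bytes removed plus that suffix. The helper names are hypothetical, and the sketch ignores everything else in an entry.

```python
import io


def _read_byte(stream):
    """Read a single byte as an int, failing loudly on a truncated stream."""
    data = stream.read(1)
    if not data:
        raise ValueError("unexpected end of index data")
    return data[0]


def decode_varint(stream):
    """Decode the variable-width integer used for V4 path prefixes."""
    c = _read_byte(stream)
    value = c & 0x7F
    while c & 0x80:
        # Each continuation byte adds one before shifting, so that no
        # value has more than one encoding.
        value += 1
        c = _read_byte(stream)
        value = (value << 7) + (c & 0x7F)
    return value


def read_v4_path(stream, previous_path):
    """Rebuild one entry path from the previous path and the stored suffix."""
    strip = decode_varint(stream)
    if strip > len(previous_path):
        raise ValueError("prefix length exceeds previous path length")
    suffix = bytearray()
    while True:
        c = _read_byte(stream)
        if c == 0:  # the suffix is NUL-terminated
            break
        suffix.append(c)
    return previous_path[: len(previous_path) - strip] + bytes(suffix)


# Two synthetic entries: the second drops the last 5 bytes ("a.txt") of the
# first path and appends "b.txt".
stream = io.BytesIO(b"\x00dir/file_a.txt\x00" + b"\x05b.txt\x00")
first = read_v4_path(stream, b"")
second = read_v4_path(stream, first)
print(first, second)
```

Note that V4 entries are also not padded to a multiple of eight bytes the way V2 and V3 entries are, so a real reader would have to skip the padding logic for that version.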