pip-tools icon indicating copy to clipboard operation
pip-tools copied to clipboard

pip-compile as pre-commit hook corrupts git worktree index under certain circumstances

Open gimbo opened this issue 4 years ago • 2 comments

:facepalm: This one's a bit complex, so apologies for the long report — but I'm trying to lay out everything as clearly as I can. If you want to get right down into it, just skip ahead to "Steps to replicate" where you can see code & output demonstrating the issue.

Short version

It seems that when the pip-compile pre-commit hook runs during a git commit on a git worktree folder, if one of the requirements is specified by a URL (e.g. and in particular a github URL), then the worktree's index file is over-written by that of the URL-specified package, which aborts the git commit and renders the worktree folder unusable under git (until fixed somehow).

I'm guessing that pip-compile is cloning the repo specified by the URL while trying to compute whether there are any changes, and that under the special case of a git worktree something goes wrong so that the cloned repo's index gets written in place of the worktree's index. This only seems to happen in a worktree (not the 'base' git repo folder), and only when the hook runs as part of an actual commit.

Detailed version

Laying it out with more detail/precision/examples... (Again, maybe just skip ahead to "Steps to replicate" if you prefer to read code/output.)

Suppose:

  • We have some python codebase codebase_x which uses pip-tools to maintain its dependencies via requirements.{in,txt} files.

  • One of codebase_x's dependencies in its requirements.in (let's call it package_y) is specified by a URL of a git repo — e.g. git+https://github.com/.../package_y@main#egg=package_y.

  • codebase_x is also under git version control, and it uses the pip-compile pre-commit hook as described in the pip-tools README.rst to keep requirements.txt in sync with requirements.in

  • We have codebase_x checked out in some folder, e.g. ~/codebase_x.

  • We have a git worktree created from that folder, e.g. in ~/worktree_dir.

    (So ~/worktree_dir/.git is just a file pointing to ~/codebase_x/.git/worktrees/worktree_dir/, which contains the worktree's actual index, HEAD, etc.)

Then, while working in ~/worktree_dir, if we add some other dependency to the requirements.in file:

  • Running pre-commit run --all-files causes the pip-compile pre-commit hook to pick up the dependency and updates requirements.txt as we'd expect; pre-commit reports "Failed", but it's done what we wanted: updated requirements.txt keeping it in sync with requirements.in.

But then:

  • If we try to commit the changed requirements.{in,txt} files, when the pip-compile pre-commit hook runs again it passes (as it should: no changes to compute), but it leaves the worktree folder's git index file corrupted — specifically (given the example above), it seems to overwrite ~/codebase_x/.git/worktrees/worktree_dir/index with (maybe a slightly modified version of) the index file from package_y's git repo.

This renders the ~/worktree_dir worktree folder unusable under git.

(It seems possible to remedy this by copying a clean index from somewhere appropriate, e.g. ~/codebase_x/.git/index works fine provided worktree_dir was up to date with it before we tried this commit. I'm not 100% certain this fixes everything, but it seems to work OK then.)

Steps to replicate

I've set up an example repo and a script in a gist to hopefully make replication easy. The following should suffice, provided python runs python > 3.9 in your environment:

git clone https://gist.github.com/c2d43c9ff2f63771c6d1086422311b2b.git bug_demo
cd bug_demo
sh ./bug_demo.sh

The example repo (pip_compile_worktree_bug_demo) has one dependency (more-itertools, chosen because it has no dependencies of its own), which is specified by a URL — see its requirements.in.

The bug_demo.sh script (in the gist):

  • Clones pip_compile_worktree_bug_demo.
  • Creates a sibling git worktree folder.
  • Activates and populates a venv in that worktree.
  • Adds pipdeptree as an ordinary dependency to requirements.in (again, no transitive dependencies, for simplicity).
  • Runs pip-compile, via pre-commit run --all-files, which updates requirements.txt accordingly.
  • Tries to commit the modified requirements.* files.
    • That commit attempt triggers another pre-commit run which corrupts the index file.
  • Demonstrates the corruption by:
    • Listing the filenames in the worktree's index, which now shows filenames from more-itertools, which it absolutely should not.
    • Trying to run git status in the worktree, which also fails (as the index refers to a non-existent file, i.e. something from more-itertools).

Expected result

The expected result when running that script would be:

  • The git commit would succeed.
  • The worktree's index would be updated correctly with the changes (rather than being overwritten)
  • The second git ls-files -s in the script would show files from the original repo, not from more-itertools.
  • git status would still work.

Actual result

See the example run in the gist, which shows the broken behaviour detailed just above.

In particular, note:

  • Line 108: "Failed" pre-commit run --all-files where the pip-compile hook has updated requirements.txt correctly — this is as we'd expect.
  • Line 171: before the git commit call, git ls-files -s shows the correct filenames in the worktree's index — this is as we'd expect.
  • Line 187: "Passed" pre-commit run as part of the git commit, as requirements.txt was already up to date with requirements.in — this is as we'd expect.
  • Line 244: here the git commit call fails, because the pip-compile pre-commit hook (verbose output just above this line) has apparently overwritten the worktree's index, confusing the hell out of git.
  • Line 258: after that, git ls-files-s shows the wrong filenames in the worktree's index — it shows filenames from the more-itertools repo!
  • (git status then fails)

(The call topre-commit run --all-files before the git commit is not, of course, necessary — the bug is also triggered without it. But I'm including to show that the bug occurs even if the pip-compile pre-commit hook didn't need to change requirements.txt.)

Environment Versions

  1. macOS 12.2.1
  2. Python version: `Python 3.9.9
  3. pip version: pip 22.0.4
  4. pip-tools version: pip-compile, version 6.5.1
  5. pre-commit version: pre-commit 2.17.0

gimbo avatar Mar 10 '22 20:03 gimbo

For now I've worked around this by reverting to using MartinThoma/check-pip-compile.

gimbo avatar Mar 29 '22 17:03 gimbo

this likely needs something similar to https://github.com/pre-commit/pre-commit/blob/3cdc6c9d81149ab1f74b05c17743f988cb3af655/pre_commit/git.py#L27-L48

asottile avatar May 10 '22 17:05 asottile