`.gitignore` entries are ignored when the root path matched one of the entries

Open mgorny opened this issue 8 months ago • 1 comments

While installing nbclassic-1.3.0 from a source distribution, different files would be installed to the system, depending on the directory to which the distribution was unpacked. After some investigation in https://github.com/jupyter/nbclassic/issues/336, we've determined that Hatchling does not respect (the remaining) .gitignore patterns if the project root matched a single .gitignore entry.

Please consider the following reproducer:

cat > pyproject.toml <<-EOF
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "foo"
version = "0"

[tool.hatch.build.targets.sdist]
artifacts = ["foo/data/test.txt"]
EOF
cat > .gitignore <<-EOF
foo/data/test.txt
dist/
EOF
mkdir -p foo/data
> foo/__init__.py
> foo/data/test.txt

Building in the git repository works as expected:

$ python -m build
* Creating isolated environment: virtualenv+pip...
* Installing packages in isolated environment:
  - hatchling
* Getting build dependencies for sdist...
* Building sdist...
* Building wheel from sdist
* Creating isolated environment: virtualenv+pip...
* Installing packages in isolated environment:
  - hatchling
* Getting build dependencies for wheel...
* Building wheel...
Successfully built foo-0.tar.gz and foo-0-py2.py3-none-any.whl
$ unzip -l dist/foo-0-py2.py3-none-any.whl 
Archive:  dist/foo-0-py2.py3-none-any.whl
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  02-02-2020 00:00   foo/__init__.py
       43  02-02-2020 00:00   foo-0.dist-info/METADATA
      105  02-02-2020 00:00   foo-0.dist-info/WHEEL
      250  02-02-2020 00:00   foo-0.dist-info/RECORD
---------                     -------
      398                     4 files

However, building inside the dist subdirectory causes the .gitignore entry for foo/data/test.txt to be ignored:

$ cd dist
$ tar -xf foo-0.tar.gz
$ cd foo-0
$ python -m build -w
* Creating isolated environment: virtualenv+pip...
* Installing packages in isolated environment:
  - hatchling
* Getting build dependencies for wheel...
* Building wheel...
Successfully built foo-0-py2.py3-none-any.whl
$ unzip -l dist/foo-0-py2.py3-none-any.whl 
Archive:  dist/foo-0-py2.py3-none-any.whl
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  02-02-2020 00:00   foo/__init__.py
        0  02-02-2020 00:00   foo/data/test.txt
       43  02-02-2020 00:00   foo-0.dist-info/METADATA
      105  02-02-2020 00:00   foo-0.dist-info/WHEEL
      321  02-02-2020 00:00   foo-0.dist-info/RECORD
---------                     -------
      469                     5 files

Note that if I remove the entry for dist/, everything starts working again:

$ sed -i -e /dist/d .gitignore
$ python -m build -w
* Creating isolated environment: virtualenv+pip...
* Installing packages in isolated environment:
  - hatchling
* Getting build dependencies for wheel...
* Building wheel...
Successfully built foo-0-py2.py3-none-any.whl
$ unzip -l dist/foo-0-py2.py3-none-any.whl 
Archive:  dist/foo-0-py2.py3-none-any.whl
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  02-02-2020 00:00   foo/__init__.py
       43  02-02-2020 00:00   foo-0.dist-info/METADATA
      105  02-02-2020 00:00   foo-0.dist-info/WHEEL
      250  02-02-2020 00:00   foo-0.dist-info/RECORD
---------                     -------
      398                     4 files

This is particularly problematic because, as a package author, you really can't predict (nor should you try to) which directories users will use to build your package from source. Furthermore, it is hard to even notice that something went wrong, as the only effect is that additional files get installed.
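One way to make the discrepancy visible is to compare the file list of a wheel built in the git checkout with one built from the unpacked sdist. The following is only a minimal sketch: the helper names `wheel_file_set` and `extra_files` are hypothetical and not part of Hatchling or `build`; it relies solely on the fact that a wheel is a zip archive.

```python
import zipfile


def wheel_file_set(wheel_path):
    """Return the set of member names stored in a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return set(wheel.namelist())


def extra_files(reference_wheel, candidate_wheel):
    """List files present in candidate_wheel but missing from reference_wheel."""
    reference = wheel_file_set(reference_wheel)
    candidate = wheel_file_set(candidate_wheel)
    return sorted(candidate - reference)
```

Called with the wheel from the git checkout as `reference_wheel` and the wheel built inside `dist/foo-0` as `candidate_wheel`, a non-empty result would point at files such as `foo/data/test.txt` that slipped past the `.gitignore` patterns.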

The problem seems to have been introduced in #1643 + #1791, CC @jameshilliard.

Reproduced with Hatchling 1.27.0, on Python 3.13.3.

mgorny, Apr 29 '25 09:04

> Hatchling does not respect (the remaining) .gitignore patterns if the project root matched a single .gitignore entry.

If the project root is ignored, then the project is known not to be tracked by git, which is why .gitignore is disabled in this case.

> However, building inside the dist subdirectory causes the .gitignore entry for foo/data/test.txt to be ignored:

This seems to be working as intended: we disable the .gitignore logic since we know the project isn't tracked by git.
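As a rough illustration of the behaviour described here, the toy model below drops all patterns once the project root itself matches one of them. It is not Hatchling's code: the function names are hypothetical, and the matcher is a deliberate simplification that only understands `name/` directory patterns and plain globs, not full gitignore semantics.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def matches(pattern, rel_path):
    """Toy gitignore matcher: only 'name/' directory patterns and plain globs."""
    parts = PurePosixPath(rel_path).parts
    if pattern.endswith("/"):
        name = pattern.rstrip("/")
        # Assumption: a directory pattern matches that name at any depth.
        return any(fnmatch(part, name) for part in parts)
    return fnmatch(rel_path, pattern)


def effective_patterns(patterns, root_rel_path):
    """Return the patterns to apply, or [] when the project root is ignored."""
    if any(matches(pattern, root_rel_path) for pattern in patterns):
        # Root is ignored, so the tree is assumed untracked: skip .gitignore.
        return []
    return list(patterns)
```

With the reproducer's patterns `["foo/data/test.txt", "dist/"]`, a root of `"."` keeps both patterns, while a root of `"dist/foo-0"` matches `dist/` and yields no patterns at all, which is why `foo/data/test.txt` ends up in the wheel. Removing `dist/` keeps the remaining pattern active for both roots.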

> This is particularly problematic because, as a package author, you really can't predict (nor should you try to) which directories users will use to build your package from source. Furthermore, it is hard to even notice that something went wrong, as the only effect is that additional files get installed.

This is weird: why would the sdist (which would normally be generated from a valid git tree and thus should already have ignored files pruned based on the .gitignore rules) contain files that are not supposed to be installed?

I think you're looking at the issue the wrong way: the bug here seems to be that the files are in the sdist in the first place. In general, the sdist project files should not require further filtering, AFAIU.
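Whether a given file is already in the sdist can be checked directly by listing the archive. This is a minimal sketch with hypothetical helper names (`sdist_files`, `sdist_contains`); it assumes the conventional sdist layout in which every member sits under a single top-level `<name>-<version>/` directory.

```python
import tarfile
from pathlib import PurePosixPath


def sdist_files(sdist_path):
    """Return file paths in an sdist, relative to its top-level directory."""
    with tarfile.open(sdist_path) as archive:
        names = [member.name for member in archive.getmembers() if member.isfile()]
    # Assumption: drop the leading "<name>-<version>/" path component.
    return sorted(str(PurePosixPath(*PurePosixPath(name).parts[1:])) for name in names)


def sdist_contains(sdist_path, relative_file):
    """Check whether relative_file (e.g. 'foo/data/test.txt') is in the sdist."""
    return relative_file in sdist_files(sdist_path)
```

For the reproducer, `sdist_contains("dist/foo-0.tar.gz", "foo/data/test.txt")` would show whether the `artifacts` setting pulled the ignored file into the sdist.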

> The problem seems to have been introduced in https://github.com/pypa/hatch/pull/1643 + https://github.com/pypa/hatch/pull/1791, CC @jameshilliard.

I'm not sure how this issue would have been introduced there. Prior to those PRs, if the project root was ignored, you would have a completely broken build, as all project files would match the ignore pattern and fail to be included at all.

jameshilliard, Jul 30 '25 17:07