Inconsistent/surprising symlink handling during sdist build?
The problem: I am working on a project where I moved from a flat layout to a src/ layout at some point and, for backwards compatibility with people using the project as an editable installation, I kept a symlink pkgname -> src/pkgname in the repository: https://github.com/karlicoss/HPI/blob/master/my
Now, I tried to migrate from setuptools to hatch and ended up with an issue where my wheel was completely empty. After some digging, I reduced it to the following minimal example:
./src
./src/testpkg
./src/testpkg/__init__.py
./pyproject.toml
$ cat pyproject.toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "testpkg"
version = "1.0"
Then I try to build:
- with no symlink, the result is as expected:
$ uv tool run hatch build && tar -tf dist/*.tar.gz
dist/testpkg-1.0-py2.py3-none-any.whl
testpkg-1.0/src/testpkg/__init__.py
testpkg-1.0/pyproject.toml
testpkg-1.0/PKG-INFO
- create a symlink named yyy (I think the important thing is that yyy is bigger than testpkg lexicographically) -- the results are as expected:
$ ln -s src/testpkg yyy
$ uv tool run hatch build && tar -tf dist/*.tar.gz
dist/testpkg-1.0-py2.py3-none-any.whl
testpkg-1.0/src/testpkg/__init__.py
testpkg-1.0/pyproject.toml
testpkg-1.0/PKG-INFO
- create a symlink named aaa (aaa is less than testpkg lexicographically) -- the resulting sdist/wheel are broken and contain the wrong package name:
dist/testpkg-1.0-py2.py3-none-any.whl
testpkg-1.0/aaa/__init__.py
testpkg-1.0/pyproject.toml
testpkg-1.0/PKG-INFO
More random observations: In my original case, I was using something like
[tool.hatch.build.targets.wheel]
packages = ["src/testpkg"]
since I have a namespace package without __init__.py. The result was that the symlink got resolved, the sdist ended up with the wrong source path, and then the files all got filtered out (resulting in an empty wheel).
- I tried to use exclude to prevent symlink from being handled:
[tool.hatch.build.targets.sdist]
exclude = ["/aaa"]
This, however, removes __init__.py from the sdist altogether!
I imagine what happens here is there is some sort of "collection phase", where all sources get collected, but
- it follows symlinks (somewhat surprising, but perhaps there are some legitimate uses of this)
- keeps track of files (inodes?) it already processed (to avoid dupes)
And only then is the exclude option applied.
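To make that guess concrete, here is a toy Python model of such a collection phase. It is only a sketch of the behaviour I am hypothesising, not hatchling's actual code: the names make_project and collect are made up for illustration, and exclude is simplified to root-anchored directory prefixes like "/aaa".

```python
import os
import tempfile


def make_project(link_name):
    """Create the minimal layout from above plus a symlink to src/testpkg."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "src", "testpkg"))
    for rel in ("pyproject.toml", os.path.join("src", "testpkg", "__init__.py")):
        open(os.path.join(root, rel), "w").close()
    os.symlink(os.path.join("src", "testpkg"), os.path.join(root, link_name),
               target_is_directory=True)
    return root


def collect(root, exclude=()):
    """Hypothetical model: follow symlinks, dedupe by real path, exclude last."""
    seen = set()    # real paths of files that were already collected
    collected = []  # relative paths as they would appear in the sdist
    for dirpath, dirnames, filenames in os.walk(root, followlinks=True):
        dirnames.sort()  # assumption: directories are visited in sorted order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            real = os.path.realpath(path)
            if real in seen:
                continue  # same file reached earlier under another name
            seen.add(real)
            collected.append(os.path.relpath(path, root).replace(os.sep, "/"))

    def is_excluded(rel):
        # Simplification: each pattern is a root-anchored path prefix.
        return any(rel == pat.strip("/") or rel.startswith(pat.strip("/") + "/")
                   for pat in exclude)

    # Exclusion only runs after deduplication has already picked a winner.
    return [rel for rel in collected if not is_excluded(rel)]


if __name__ == "__main__":
    print(collect(make_project("yyy")))
    print(collect(make_project("aaa")))
    print(collect(make_project("aaa"), exclude=["/aaa"]))
```

Under these assumptions the model reproduces all three observations: with yyy the file is kept as src/testpkg/__init__.py, with aaa it is kept as aaa/__init__.py, and excluding /aaa drops __init__.py entirely because the src/ copy was already discarded as a duplicate.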
The only thing that seemed to help was using the only-include option.
It's a bit unfortunate that this requires manually listing everything you want to include in the sdist. I wonder if it would make sense to have a similar option, but for excluding? Or maybe only-include could support negative matches/globs?
Another thing: perhaps it's worth showing an error/warning of some sort if hatch discovers the same files under multiple different references (e.g. original + symlink)? The whole thing took me a while to identify and debug, so it would be nice to make it easier for other people. Thanks!
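As a rough illustration of what such a check could look like, here is a small standalone Python sketch. It is not part of hatch; find_aliased_files and format_warnings are hypothetical names, and the walk does not guard against symlink cycles.

```python
import os
import tempfile
from collections import defaultdict


def find_aliased_files(root):
    """Map each real file to the paths under root that reach it, if more than one."""
    by_real = defaultdict(list)
    # followlinks=True descends into symlinked directories (no cycle protection).
    for dirpath, dirnames, filenames in os.walk(root, followlinks=True):
        dirnames.sort()
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            by_real[os.path.realpath(path)].append(rel)
    return {real: rels for real, rels in by_real.items() if len(rels) > 1}


def format_warnings(aliases):
    """Turn the alias map into human-readable warning lines."""
    return [f"warning: {' and '.join(rels)} refer to the same file"
            for rels in aliases.values()]


if __name__ == "__main__":
    # Rebuild the minimal example with the problematic aaa symlink.
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "src", "testpkg"))
    open(os.path.join(root, "src", "testpkg", "__init__.py"), "w").close()
    os.symlink(os.path.join("src", "testpkg"), os.path.join(root, "aaa"),
               target_is_directory=True)
    for line in format_warnings(find_aliased_files(root)):
        print(line)
```

Running something like this over the project root before building would have pointed straight at the aaa symlink in the minimal example.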
Possibly relevant issue: https://github.com/pypa/hatch/issues/276
👍 to unexpectedly encountering this issue.