MANIFEST.in for setuptools/pybind11 is excluding too much?
This line https://github.com/scikit-hep/cookie/blob/cc1ea50641fc6f87b55ba5143d954a4c672fe07c/%7B%7Bcookiecutter.project_name%7D%7D/MANIFEST-setuptools%2Cpybind11.in#L5
seems to remove almost all files. As an example, here is a script I wrote that mimics how MANIFEST.in is parsed by distutils (line by line):
$ cat file_list.py
from distutils.filelist import FileList
file_list = FileList()
for line in open('MANIFEST.in').readlines():
    line = line.strip()
    if not line:
        continue  # skip blank lines
    print(line)
    file_list.process_template_line(line)
    # show the file list after each MANIFEST.in command
    print(file_list.files, end='\n'*2)
and the output:
graft src
['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic/__pycache__/__init__.cpython-39.pyc', 'src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt']
graft tests
['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic/__pycache__/__init__.cpython-39.pyc', 'src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt', 'tests/test_package.py', 'tests/test_compiled.py', 'tests/__pycache__/test_package.cpython-39-pytest-7.1.1.pyc', 'tests/__pycache__/test_compiled.cpython-39-pytest-7.1.1.pyc']
include LICENSE README.md pyproject.toml setup.py setup.cfg
['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic/__pycache__/__init__.cpython-39.pyc', 'src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt', 'tests/test_package.py', 'tests/test_compiled.py', 'tests/__pycache__/test_package.cpython-39-pytest-7.1.1.pyc', 'tests/__pycache__/test_compiled.cpython-39-pytest-7.1.1.pyc', 'LICENSE', 'README.md', 'pyproject.toml', 'setup.py', 'setup.cfg']
global-exclude __pycache__ *.py[cod] .*
warning: no previously-included files matching '__pycache__' found anywhere in distribution
['src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'LICENSE']
If I instead drop the .* pattern at the end of this line, I get
global-exclude __pycache__ *.py[cod]
warning: no previously-included files matching '__pycache__' found anywhere in distribution
['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt', 'tests/test_package.py', 'tests/test_compiled.py', 'LICENSE', 'README.md', 'pyproject.toml', 'setup.py', 'setup.cfg']
which looks potentially better. I suspect the intended line was something like exclude .* since I think the goal was to exclude hidden files (those starting with a period), but global-exclude seems to translate the pattern into a regex that is not anchored to the start of the path, so it can match anywhere. See the investigation I did below:
>>> files = ['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic/__pycache__/__init__.cpython-39.pyc', 'src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt', 'tests/test_package.py', 'tests/test_compiled.py', 'tests/__pycache__/test_package.cpython-39-pytest-7.1.1.pyc', 'tests/__pycache__/test_compiled.cpython-39-pytest-7.1.1.pyc', 'LICENSE', 'README.md', 'pyproject.toml', 'setup.py', 'setup.cfg']
>>> from distutils.filelist import translate_pattern
>>> translate_pattern(".*", 0, None, 0) # action: global-exclude
re.compile('(?s:\\.[^/]*)\\Z')
>>> translate_pattern(".*", 1, None, 0) # action: exclude
re.compile('(?s:\\A\\.[^/]*)\\Z')
>>> [f for f in files if translate_pattern(".*", 0, None, 0).search(f)]
['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic/__pycache__/__init__.cpython-39.pyc', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt', 'tests/test_package.py', 'tests/test_compiled.py', 'tests/__pycache__/test_package.cpython-39-pytest-7.1.1.pyc', 'tests/__pycache__/test_compiled.cpython-39-pytest-7.1.1.pyc', 'README.md', 'pyproject.toml', 'setup.py', 'setup.cfg']
>>> [f for f in files if translate_pattern(".*", 1, None, 0).search(f)]
[]
which shows that the global-exclude form of the pattern matches far more files than intended: every path whose last component contains a period, not just hidden files.
See https://github.com/python/cpython/blob/b3f2d4c8bab52573605c96c809a1e2162eee9d7e/Lib/distutils/filelist.py#L115 for reference (anchor=0 or anchor=1).
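For anyone who wants to reproduce this without distutils installed, here is a minimal sketch that replays the two regexes from the translate_pattern output above using only the re module. The HIDDEN_BASENAME regex, the matched helper, and the two dotfile paths are my own assumptions for illustration; they are not something distutils or setuptools produces.

```python
import re

# Regexes copied from the translate_pattern() output shown above.
GLOBAL_EXCLUDE = re.compile(r"(?s:\.[^/]*)\Z")   # anchor=0 (global-exclude)
EXCLUDE = re.compile(r"(?s:\A\.[^/]*)\Z")        # anchor=1 (exclude)

# Hypothetical alternative (NOT generated by distutils): match only files
# whose final path component starts with a period.
HIDDEN_BASENAME = re.compile(r"(?:\A|/)\.[^/]*\Z")

# A few paths from the listing above, plus two made-up dotfiles.
paths = [
    "setup.py",
    "src/pylibmagic/__init__.py",
    "LICENSE",
    "src/pylibmagic.egg-info/PKG-INFO",
    ".gitignore",   # made up for this example
    "src/.hidden",  # made up for this example
]


def matched(regex, candidates):
    """Return the candidates that the regex finds a match in."""
    return [p for p in candidates if regex.search(p)]


print("global-exclude:", matched(GLOBAL_EXCLUDE, paths))
print("exclude:       ", matched(EXCLUDE, paths))
print("hidden only:   ", matched(HIDDEN_BASENAME, paths))
```

The unanchored regex hits every path whose last component contains a period, while the anchored one only hits top-level dotfiles. The hypothetical third regex is just one way to express "hidden files at any depth" and would still need to be mapped onto whatever MANIFEST.in syntax actually supports it.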