cookie icon indicating copy to clipboard operation
cookie copied to clipboard

MANIFEST.in for setuptools/pybind11 is excluding too much?

Open kratsg opened this issue 3 years ago • 0 comments

this line https://github.com/scikit-hep/cookie/blob/cc1ea50641fc6f87b55ba5143d954a4c672fe07c/%7B%7Bcookiecutter.project_name%7D%7D/MANIFEST-setuptools%2Cpybind11.in#L5

seems to remove all files. As an example project, see this script I wrote which mimics how MANIFEST.in is parsed using distutils (line-by-line)

$ cat file_list.py 
from distutils.filelist import FileList
file_list = FileList()

for line in open('MANIFEST.in').readlines():
    line = line.strip()
    if not line: continue
    print(line)
    file_list.process_template_line(line)
    print(file_list.files, end='\n'*2)

and the output

graft src
['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic/__pycache__/__init__.cpython-39.pyc', 'src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt']

graft tests
['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic/__pycache__/__init__.cpython-39.pyc', 'src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt', 'tests/test_package.py', 'tests/test_compiled.py', 'tests/__pycache__/test_package.cpython-39-pytest-7.1.1.pyc', 'tests/__pycache__/test_compiled.cpython-39-pytest-7.1.1.pyc']

include LICENSE README.md pyproject.toml setup.py setup.cfg
['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic/__pycache__/__init__.cpython-39.pyc', 'src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt', 'tests/test_package.py', 'tests/test_compiled.py', 'tests/__pycache__/test_package.cpython-39-pytest-7.1.1.pyc', 'tests/__pycache__/test_compiled.cpython-39-pytest-7.1.1.pyc', 'LICENSE', 'README.md', 'pyproject.toml', 'setup.py', 'setup.cfg']

global-exclude __pycache__ *.py[cod] .*
warning: no previously-included files matching '__pycache__' found anywhere in distribution
['src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'LICENSE']

If I drop, instead the .* requirement at the end of this line, I get

global-exclude __pycache__ *.py[cod]
warning: no previously-included files matching '__pycache__' found anywhere in distribution
['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt', 'tests/test_package.py', 'tests/test_compiled.py', 'LICENSE', 'README.md', 'pyproject.toml', 'setup.py', 'setup.cfg']

which looks potentially better? I suspect what should have happened is a line like exclude .* since I think the goal was to exclude (hidden) files starting with periods, but global-exclude seems to be a regex that will match anywhere. See some investigation I did below:

>>> files = ['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic/__pycache__/__init__.cpython-39.pyc', 'src/pylibmagic.egg-info/PKG-INFO', 'src/pylibmagic.egg-info/not-zip-safe', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt', 'tests/test_package.py', 'tests/test_compiled.py', 'tests/__pycache__/test_package.cpython-39-pytest-7.1.1.pyc', 'tests/__pycache__/test_compiled.cpython-39-pytest-7.1.1.pyc', 'LICENSE', 'README.md', 'pyproject.toml', 'setup.py', 'setup.cfg']

>>> from distutils.filelist import translate_pattern
>>> translate_pattern(".*", 0, None, 0) # action: global-exclude
re.compile('(?s:\\.[^/]*)\\Z')
>>> translate_pattern(".*", 1, None, 0) # action: exclude
re.compile('(?s:\\A\\.[^/]*)\\Z')
>>> [f for f in files if translate_pattern(".*", 0, None, 0).search(f)]
['src/pylibmagic/_version.pyi', 'src/pylibmagic/_version.py', 'src/pylibmagic/__init__.py', 'src/pylibmagic/py.typed', 'src/pylibmagic/__pycache__/__init__.cpython-39.pyc', 'src/pylibmagic.egg-info/SOURCES.txt', 'src/pylibmagic.egg-info/requires.txt', 'src/pylibmagic.egg-info/top_level.txt', 'src/pylibmagic.egg-info/dependency_links.txt', 'tests/test_package.py', 'tests/test_compiled.py', 'tests/__pycache__/test_package.cpython-39-pytest-7.1.1.pyc', 'tests/__pycache__/test_compiled.cpython-39-pytest-7.1.1.pyc', 'README.md', 'pyproject.toml', 'setup.py', 'setup.cfg']
>>> [f for f in files if translate_pattern(".*", 1, None, 0).search(f)]
[]

which shows you that the files that got matched by the pattern is perhaps too greedy?

See https://github.com/python/cpython/blob/b3f2d4c8bab52573605c96c809a1e2162eee9d7e/Lib/distutils/filelist.py#L115 for reference (anchor=0 or anchor=1).

kratsg avatar Mar 20 '22 06:03 kratsg