setuptools icon indicating copy to clipboard operation
setuptools copied to clipboard

Don't emit warning for explicitly included data files

Open cdce8p opened this issue 1 year ago • 3 comments

Summary of changes

From what I can tell based on the discussion in #3340, it would make sense to hide the Package would be ignored warning for explicitly including data files (via tool.setuptools.package-data or similar).

cdce8p avatar Jan 04 '25 12:01 cdce8p

I have a different understanding of the previous discussion: if the package is not included, then the data files contained by it cannot be included (adding the package itself is a pre-requisite for adding its data files)...

So the warning is a call to action for the users to add the package name in the package list (this would be along the lines of "explicit is better than implicit" and raising awareness of the users that for Python purposes the directories are packages during runtime and can even be imported).

abravalheri avatar Jan 08 '25 14:01 abravalheri

I have a different understanding of the previous discussion: if the package is not included, then the data files contained by it cannot be included (adding the package itself is a pre-requisite for adding its data files)...

My specific example which motivated the PR is Mypy. It uses packages.find.include with a wildcard ["mypy*", ...] and additionally package-data. The paths used package-data are subfolders as it would clutter the main folder too much otherwise. The warning in this case is pointless IMO.

https://github.com/python/mypy/blob/v1.14.1/pyproject.toml#L79-L92

So the warning is a call to action for the users to add the package name in the package list (this would be along the lines of "explicit is better than implicit" and raising awareness of the users that for Python purposes the directories are packages during runtime and can even be imported).

My goal here would be to think of a better user story, ie. what can you do if you see the warning, are aware of it but still want to go ahead with it? Namespace packages aren't really an option. Some alternative ideas

  • If explicit is better them implicit, would it make sense to add a new option, maybe tool.setuptools.data-folder where users can explicitly name folder (globs) which should opt-out of this check?
  • As the warning is quite annoying, I even thought about overwriting build_py. _filter_build_files for mypy just to get rid of it. Maybe we could instead refactor analyze_manifest so it wouldn't be necessary to overwrite a private method. https://github.com/pypa/setuptools/blob/fb7f3d3cab98f2c00cbf18fe887ce73007f49e18/setuptools/command/build_py.py#L196-L197

cdce8p avatar Jan 09 '25 18:01 cdce8p

Thanks for the context @cdce8p that is really helpful.

I guess the point that the warning is complaining is that you want to include data files for the following packages:

mypy.typeshed
mypy.typeshed.stdlib
mypy.typeshed.stdlib._typeshed
mypy.typeshed.stdlib.asyncio
mypy.typeshed.stdlib.collections
mypy.typeshed.stdlib.concurrent
mypy.typeshed.stdlib.concurrent.futures
mypy.typeshed.stdlib.ctypes
mypy.typeshed.stdlib.ctypes.macholib
mypy.typeshed.stdlib.curses
mypy.typeshed.stdlib.dbm
mypy.typeshed.stdlib.distutils
mypy.typeshed.stdlib.distutils.command
mypy.typeshed.stdlib.email
mypy.typeshed.stdlib.email.mime
mypy.typeshed.stdlib.encodings
mypy.typeshed.stdlib.ensurepip
mypy.typeshed.stdlib.html
mypy.typeshed.stdlib.http
mypy.typeshed.stdlib.importlib
mypy.typeshed.stdlib.importlib.metadata
mypy.typeshed.stdlib.importlib.resources
mypy.typeshed.stdlib.json
mypy.typeshed.stdlib.lib2to3
mypy.typeshed.stdlib.lib2to3.fixes
mypy.typeshed.stdlib.lib2to3.pgen2
mypy.typeshed.stdlib.logging
mypy.typeshed.stdlib.msilib
mypy.typeshed.stdlib.multiprocessing
mypy.typeshed.stdlib.multiprocessing.dummy
mypy.typeshed.stdlib.os
mypy.typeshed.stdlib.pydoc_data
mypy.typeshed.stdlib.pyexpat
mypy.typeshed.stdlib.sqlite3
mypy.typeshed.stdlib.sys
mypy.typeshed.stdlib.tkinter
mypy.typeshed.stdlib.unittest
mypy.typeshed.stdlib.urllib
mypy.typeshed.stdlib.venv
mypy.typeshed.stdlib.wsgiref
mypy.typeshed.stdlib.xml
mypy.typeshed.stdlib.xml.dom
mypy.typeshed.stdlib.xml.etree
mypy.typeshed.stdlib.xml.parsers
mypy.typeshed.stdlib.xml.parsers.expat
mypy.typeshed.stdlib.xml.sax
mypy.typeshed.stdlib.xmlrpc
mypy.typeshed.stdlib.zipfile
mypy.typeshed.stdlib.zipfile._path
mypy.typeshed.stdlib.zoneinfo
mypy.typeshed.stubs
mypy.typeshed.stubs.mypy-extensions
mypy.xml

But the configuration that you currently have is telling setuptools to not include these packages in the final wheel (they are absent in the packages configuration). So the configuration is ambiguous to setuptools.

Would something similar to the following make sense (untested, please double check):

[tool.setuptools.packages.find]
include = ["mypy*", "mypyc*", "*__mypyc*"]
exclude = [
  "mypyc.external*",
  "mypyc.lib-rt*",
  "mypyc.test-*",   # Maybe this one should be `mypyc.test*`, but apparently the current configuration is distributing `mypyc.test`
  "mypyc.doc*"
]
# implicit `namespaces = true`

In the end of the day namespaces = false works like a exclude = [...] that automatically excludes all the packages without __init__.py. However, it seems that you want to include some of the packages that don't have __init__.py but not all of them. So the first solution that comes to my mind here is to explicitly pass an exclude list.

abravalheri avatar Jan 10 '25 15:01 abravalheri