[new check] unexpected files found

Open jameslamb opened this issue 3 years ago • 0 comments

What should be checked?

Python is a very flexible language, and as a result there are many file types that might legitimately and intentionally be included in packages.

However, there is also plenty of structure in Python packaging, and as a result it's possible to form some rough expectations about what types of files should or shouldn't be included in packages.

pydistcheck should raise errors when it encounters unexpected files, as a way to nudge package authors to ask "do I REALLY need to ship this file?".

For example, the following files and directories, sometimes bundled into Python package distributions, are rarely necessary:

  • .DS_Store
  • .github/
  • .gitignore
  • .readthedocs.yml
  • .travis.yml

While the following are very likely to be found:

  • *.ini
  • MANIFEST.in
  • *.py
  • pyproject.toml
  • setup.cfg
  • *.toml

What should the name of this check be?

unexpected-files-found

Will this check introduce any additional configuration?

yes

Details on additional configuration required.

This one will DEFINITELY require additional configuration. I just don't think it's possible for this project to articulate a single authoritative list of files that should / shouldn't be in all Python distributions.

The way the configuration works depends on which design is used for this. Basically, one of these:

  • "pydistcheck found the following things that are not in its list of expected file names / extensions"
  • "pydistcheck found the following things matching its list of problematic file names / extensions"

I think that the first approach future-proofs the tool better against changes in Python packaging practices.
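
To make the two designs concrete, here is a minimal sketch of both, assuming plain shell-style glob matching. The function names and the two pattern lists are hypothetical (the lists are just the examples from this issue), not existing pydistcheck code or configuration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical pattern lists, copied from the examples in this issue.
EXPECTED_PATTERNS = ["*.ini", "MANIFEST.in", "*.py", "pyproject.toml", "setup.cfg", "*.toml"]
PROBLEMATIC_PATTERNS = [".DS_Store", ".github", ".gitignore", ".readthedocs.yml", ".travis.yml"]


def find_not_expected(paths, expected=EXPECTED_PATTERNS):
    """Design 1 (allowlist): report paths whose file name matches no expected pattern."""
    return [
        path
        for path in paths
        if not any(fnmatchcase(PurePosixPath(path).name, pattern) for pattern in expected)
    ]


def find_problematic(paths, problematic=PROBLEMATIC_PATTERNS):
    """Design 2 (denylist): report paths where any component matches a problematic pattern."""
    return [
        path
        for path in paths
        if any(
            fnmatchcase(part, pattern)
            for part in PurePosixPath(path).parts
            for pattern in problematic
        )
    ]


paths = [
    "pkg/__init__.py",
    "pyproject.toml",
    ".github/workflows/ci.yml",
    ".DS_Store",
    "pkg/data.bin",
]
print(find_not_expected(paths))
print(find_problematic(paths))
```

With these inputs, the allowlist design also reports `pkg/data.bin`, which the denylist design lets through. That is the trade-off in miniature: the first design flags new file types by default, at the cost of more configuration for projects that legitimately ship them.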

But this one deserves some discussion and careful thought prior to being implemented. For now, just writing up the basic idea to get it into the issue tracker.

Distribution type

  • [X] source (e.g. .tar.gz)
  • [X] built (e.g. .whl)
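
Supporting both distribution types mostly means getting a list of file paths out of two archive formats. A rough sketch using only the standard library follows; `list_distribution_files` is a hypothetical helper name, and the sketch assumes wheels are zip archives and that sdists are gzip-compressed tarballs.

```python
import tarfile
import zipfile


def list_distribution_files(path):
    """Return the file paths inside a source (.tar.gz) or built (.whl) distribution."""
    if path.endswith(".whl"):
        # Wheels are zip archives.
        with zipfile.ZipFile(path) as zf:
            return [info.filename for info in zf.infolist() if not info.is_dir()]
    if path.endswith(".tar.gz"):
        with tarfile.open(path, mode="r:gz") as tf:
            return [member.name for member in tf.getmembers() if member.isfile()]
    raise ValueError(f"unsupported distribution format: {path}")
```

Note that paths in an sdist usually start with a top-level directory such as `name-version/`, while paths in a wheel do not, so a real check would probably need to normalize that before matching patterns.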

Notes

R CMD check accomplishes this by reserving a special directory inst/ inside the distribution tarball, and saying "anything you put in there will be ignored".

Maybe pydistcheck should support something like that? Either explicitly listing a directory, or allowing a directory to be provided via configuration.
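
One possible shape for that, as a sketch only: filter out everything under a designated directory before the check runs. The `ignored_dir` parameter and its `"inst"` default are hypothetical, borrowed from the R convention mentioned above, and paths are assumed to be relative to the project root.

```python
from pathlib import PurePosixPath


def drop_ignored(paths, ignored_dir="inst"):
    """Drop paths located under ``ignored_dir`` at the root of the distribution.

    ``ignored_dir`` is a hypothetical setting, not an existing pydistcheck option.
    """
    ignored = PurePosixPath(ignored_dir).parts
    kept = []
    for path in paths:
        parts = PurePosixPath(path).parts
        # Only a match at the root counts; "pkg/inst/x.py" is kept.
        if parts[: len(ignored)] != ignored:
            kept.append(path)
    return kept
```

Comparing path components rather than string prefixes avoids accidentally ignoring a sibling such as `install.py` or `inst_data/`.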

jameslamb, Sep 07 '22 04:09