packaging.python.org
Archive vs Distribution fuzzy
I feel the discussion in is quite confusing about Source Distributions and Source Archives. They are clearly treated as separate, with a Source Archive coming "prior to creation of a Source Distribution" in the glossary, but the output of build --sdist is a "Source Archive" and everything below is called an archive, with "You should always upload a source archive and provide built archives for the platforms your project is compatible with.".
Is this making a false distinction between a distribution and an archive? All tools (python -m build --sdist and python setup.py sdist) make a source distribution which is in an archive format, and the only non-distributable archive I know of is the one GitHub tends to make from tags, which is not something that needs discussion here.
Originally posted by @henryiii in https://github.com/pypa/packaging.python.org/pull/868#discussion_r604170979
PEP 517 uses the term "source tree" for a bunch of files that make up the source of a Python project, but which isn't a sdist. So if you want a term, I'd suggest "archived source tree".
If you're asking whether we need the idea of an archived source tree at all, I'd say we do, if only to distinguish it from a sdist (the key point about a sdist being that it has - at least in principle - a standard-defined name and layout, allowing tools to introspect it for metadata).
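To make the distinction concrete, here is a minimal sketch of how a tool might tell the two apart. It assumes only the layout rule from the sdist specification: a single top-level directory holding a PKG-INFO metadata file. The helper names and the "demo" project are hypothetical, and a real tool would also check the file name and parse the metadata.

```python
import io
import tarfile


def looks_like_sdist(fileobj):
    """Heuristic: True if a gzipped tarball has the basic sdist layout.

    Checks only two things: a single top-level directory, and a
    PKG-INFO metadata file directly inside it.
    """
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        names = tar.getnames()
    top_levels = {name.split("/", 1)[0] for name in names}
    if len(top_levels) != 1:
        return False
    (top,) = top_levels
    return f"{top}/PKG-INFO" in names


def make_tarball(files):
    """Build an in-memory .tar.gz from a {path: text} mapping (demo helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path, text in files.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


# Hypothetical project "demo": an archived source tree vs. an sdist.
archived_tree = make_tarball({
    "demo-main/pyproject.toml": "[project]\nname = 'demo'\n",
    "demo-main/src/demo/__init__.py": "",
})
sdist = make_tarball({
    "demo-1.0/pyproject.toml": "[project]\nname = 'demo'\n",
    "demo-1.0/PKG-INFO": "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n",
    "demo-1.0/src/demo/__init__.py": "",
})

print(looks_like_sdist(archived_tree))  # False: no PKG-INFO
print(looks_like_sdist(sdist))  # True
```

Both inputs are archives of the same files; only the one with standardized metadata passes the check, which is the property that lets tools introspect an sdist.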
FWIW, I think a good way to think about this is:
- If you download a zip from GitHub, that's a source archive.
- What you get after a python -m build --sdist is a source distribution, which is (usually) contents of the archive + metadata.
I've personally not had the bandwidth to look at and review the contents in this documentation but I do want to get to it in... 6 or so months (I'm currently busy improving things on the Sphinx side, so that we can have better looking pages!).
The problem I have here is that they are being mixed: Source Archive is used multiple times to clearly mean the thing you upload to PyPI, and it is also described as "a precursor" to that thing in the glossary. You should not upload a non-distribution archive. I would generally avoid talking about the original source as an archive here at all, since it's pretty arbitrary that you'd decide to archive it (often it's just cloned from git). So for this, I'd probably call it Source Tree (usually not an archive) and Source Distribution (which happens to be an archive) and avoid the term Source Archive.
> The problem I have here is that they are being mixed
Yep, that's wrong and should be fixed - we have enough confusion without inconsistent documentation making things worse 🙁
> I'd probably call it Source Tree (usually not an archive) and Source Distribution (which happens to be an archive) and avoid the term Source Archive.
+1, although to drive the point home, I'd state somewhere that source trees can be bundled into an archive, but doing so doesn't make them sdists.
I'd probably include that caveat in the definitions of both Source Tree and Source Distribution. A good example is setuptools_scm, which can generate a file that is present in the Source Distribution but not in an archive of the Source Tree. (I've got others, too, like pybind11, which produces two sdists from a single Source Tree.)
In #1541, the term "Distribution Archive" is introduced and said to refer to the actual file on disk in which the distribution package artifact is stored.
> I'd probably call it Source Tree (usually not an archive) and Source Distribution (which happens to be an archive) and avoid the term Source Archive.
The former term is, next to others, imported from the PEP 639 draft.