packaging.python.org
Add discussion on `__main__.py` vs. `__init__.py` files
This would be a valuable extension to the guidance on how to structure a Python package. People often disagree about what belongs in these files and when to use them, so the main goals of the discussion should be to present best practices and clear up common misunderstandings. This was already suggested in https://github.com/pypa/sampleproject/pull/67#discussion_r180500901.
I've been meaning to write a blog post on the topic for years. What do you have in mind exactly? I think that, packaging-wise, some `_cli.py` could declare an entry point function that would be integrated into 3 public interfaces: `<pkg>/__main__.py` for invoking it through `python -m <pkg>`, then `<pkg>/cli.py` for a more explicit `python -m <pkg>.cli`, and console scripts so that it's installed and called directly as `<pkg>`.
This allows layering, with the public interfaces (a CLI is a type of public API) kept separate from the private API. There could also be an importable public API layer exposed via `<pkg>/api.py`, usable via `from <pkg>.api import PkgAPI`.
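The layering described above can be sketched as a throwaway package written to a temporary directory. This is only an illustration under stated assumptions: the package name `demo_pkg`, the `main` and `greet` bodies, and the helper functions `build_package` and `run_module` are all hypothetical, and the console script (the third interface) is only mentioned in a comment because it requires an actual install.

```python
import os
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical package "demo_pkg"; every file body below is illustrative only.
FILES = {
    "demo_pkg/__init__.py": "",
    # Private layer: the real entry point function lives here.
    "demo_pkg/_cli.py": '''
        import sys

        def main(argv=None):
            args = sys.argv[1:] if argv is None else list(argv)
            print("demo_pkg called with", args)
            return 0
    ''',
    # Public interface 1: python -m demo_pkg
    "demo_pkg/__main__.py": '''
        import sys
        from ._cli import main

        if __name__ == "__main__":
            sys.exit(main())
    ''',
    # Public interface 2: python -m demo_pkg.cli
    # (a console script could point at "demo_pkg.cli:main" as interface 3)
    "demo_pkg/cli.py": '''
        import sys
        from ._cli import main

        if __name__ == "__main__":
            sys.exit(main())
    ''',
    # Importable public API: from demo_pkg.api import PkgAPI
    "demo_pkg/api.py": '''
        class PkgAPI:
            def greet(self, name):
                return f"hello {name}"
    ''',
}


def build_package(root):
    """Write the illustrative package files under the directory `root`."""
    for rel_path, body in FILES.items():
        target = Path(root) / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(textwrap.dedent(body).lstrip("\n"))


def run_module(root, module, *args):
    """Run `python -m <module> <args>` with `root` importable; return stdout."""
    env = {**os.environ, "PYTHONPATH": str(root)}
    result = subprocess.run(
        [sys.executable, "-m", module, *args],
        cwd=root, env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build_package(root)
        print(run_module(root, "demo_pkg", "--flag"))
        print(run_module(root, "demo_pkg.cli", "--flag"))
```

Both invocations end up in the same private `main` function, so the public entry points stay thin and nothing needs to be imported from `__main__.py`.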
Yes, the way you structured the `cheroot` package looks like what I had in mind ;-)
Recently someone was importing from my package's `__main__` file, which seems pretty dubious to me, but I couldn't find anything saying that users should not do that.
> Yes, the way you structured the `cheroot` package looks like what I had in mind ;-)
I didn't really implement it but was a reviewer of the PR that added that CLI interface. I think it only implements a subset of what I now consider a good practice.
> Recently someone was importing from my package's `__main__` file, which seems pretty dubious to me, but I couldn't find anything saying that users should not do that.
I've only seen an `import __main__` once, in a pretty complicated piece of software. Not sure I'd encourage people to do that. Anyway, providing a clear (sub)module named `api` seems like a better signal for what's intended to be used.
Yes, I agree that discouraging it explicitly would make sense. `import __main__` has some pretty specific and complicated meanings, but I'm referring to `from foo.__main__ import bar`, which I feel is weird to consider part of a library's public API.
My own preference is to have a `cli` submodule, and then have `__main__.py` be just:
if __name__ == "__main__":
    from .cli import main
    main()
(The default of taking arguments from `sys.argv[1:]` is then incorporated directly into `package.cli.main`, rather than expecting that to be handled by the caller.)
That way there's zero temptation to import from `package.__main__`, since importing from `package.cli` is the more obvious alternative.
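A minimal sketch of what such a `cli` submodule might contain, assuming an `argparse`-based interface. The `name` argument and the greeting are hypothetical; only the pattern of defaulting `argv` to `sys.argv[1:]` inside `main` comes from the description above.

```python
import argparse
import sys


def main(argv=None):
    """Hypothetical CLI entry point for package/cli.py."""
    # Fall back to the process arguments here, so that callers such as
    # __main__.py or a console script can simply call main() with no arguments.
    if argv is None:
        argv = sys.argv[1:]
    parser = argparse.ArgumentParser(prog="package")
    parser.add_argument("name", nargs="?", default="world")
    args = parser.parse_args(argv)
    print(f"hello {args.name}")
    return 0
```

Because `main` accepts an explicit argument list, tests and other callers can invoke it as `main(["packaging"])` without touching `sys.argv`.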
For a while I used to make my `__main__.py` files outright raise an exception when executed as anything other than a top-level `__main__` module, but I stopped doing that after I realised that the practice breaks `multiprocessing`'s `spawn` mode, which reruns `__main__` from the parent process as `__mp_main__` in the child process. (This implementation detail isn't formally documented anywhere, but https://stackoverflow.com/questions/72497140/in-python-multiprocessing-why-is-child-process-name-mp-main-is-there-a-way/72497477#72497477 accurately describes how it works.)
Omitting the `__name__` guard entirely is also not correct, for the same reason.
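The difference between the three approaches can be illustrated without starting any child processes by rerunning a script under the name `__mp_main__` with `runpy.run_path`. This is a rough simulation rather than the real `spawn` machinery, and the three script bodies and the `rerun_like_spawn` helper are hypothetical.

```python
import runpy
import tempfile
import textwrap
from pathlib import Path

# Three hypothetical main-script variants; each records whether main() ran.
VARIANTS = {
    "guarded": '''
        ran = False
        def main():
            global ran
            ran = True
        if __name__ == "__main__":
            main()
    ''',
    "unguarded": '''
        ran = False
        def main():
            global ran
            ran = True
        main()
    ''',
    "raising": '''
        ran = False
        def main():
            global ran
            ran = True
        if __name__ != "__main__":
            raise RuntimeError("must be run as __main__")
        main()
    ''',
}


def rerun_like_spawn(source):
    """Execute `source` under the module name __mp_main__ and report the outcome."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "main_variant.py"
        path.write_text(textwrap.dedent(source))
        try:
            namespace = runpy.run_path(str(path), run_name="__mp_main__")
        except RuntimeError as exc:
            return f"error: {exc}"
    return "main ran" if namespace["ran"] else "import only"


if __name__ == "__main__":
    for label, source in VARIANTS.items():
        print(label, "->", rerun_like_spawn(source))
```

Only the guarded variant is rerun quietly: the raising variant fails as soon as it is re-executed under another name, and the unguarded variant runs the program's main logic a second time.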