packaging.python.org

Add discussion on `__main__.py` vs. `__init__.py` files

Open chrysle opened this issue 1 year ago • 6 comments

This would be a valuable addition to the guidance on how to structure a Python package. People often disagree about what to store in these files and when to use them, so presenting best practices and resolving possible misunderstandings should be the main goals of the discussion. This was already suggested in https://github.com/pypa/sampleproject/pull/67#discussion_r180500901.

chrysle avatar Feb 25 '24 07:02 chrysle

I've been meaning to write a blog post on the topic for years. What do you have in mind exactly? I think that, packaging-wise, some `_cli.py` could declare an entry point function that would be exposed through three public interfaces: `<pkg>/__main__.py`, for invoking it through `python -m <pkg>`; `<pkg>/cli.py`, for a more explicit `python -m <pkg>.cli`; and console scripts, so that it's installed and can be called directly as `<pkg>`. This allows layering, with the public interfaces (a CLI is a type of public API) kept separate from the private API. There could also be an importable public API layer exposed via `<pkg>/api.py`, usable via `from <pkg>.api import PkgAPI`.
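As a rough illustration of that layering (a sketch only, not taken from cheroot or any other real project), the script below writes a throwaway package to a temporary directory and invokes it both ways. The package name `demo_pkg`, the `greet` method, and all file contents are hypothetical; the console-script route is only mentioned in a comment because it requires installing the package.

```python
import os
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical package layout and contents, for illustration only.
FILES = {
    "demo_pkg/__init__.py": "",
    # Private implementation: the entry point function is declared here.
    "demo_pkg/_cli.py": """
        import sys

        def main(argv=None):
            args = sys.argv[1:] if argv is None else list(argv)
            print("demo_pkg called with", args)
            return 0
    """,
    # Public CLI module: enables `python -m demo_pkg.cli`. A console script
    # could point at it via `demo-pkg = "demo_pkg.cli:main"` under
    # `[project.scripts]` in pyproject.toml.
    "demo_pkg/cli.py": """
        import sys
        from ._cli import main

        if __name__ == "__main__":
            sys.exit(main())
    """,
    # Enables `python -m demo_pkg`.
    "demo_pkg/__main__.py": """
        import sys
        from ._cli import main

        if __name__ == "__main__":
            sys.exit(main())
    """,
    # Importable public API layer, kept apart from the CLI.
    "demo_pkg/api.py": """
        class PkgAPI:
            def greet(self, name):
                return f"hello {name}"
    """,
}


def write_package(root):
    """Write the sketch package under the given directory."""
    for rel_path, source in FILES.items():
        path = Path(root) / rel_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(textwrap.dedent(source).lstrip())


def run_module(root, module, *args):
    """Run `python -m <module> <args>` and return its stripped stdout."""
    env = dict(os.environ, PYTHONPATH=str(root))
    result = subprocess.run(
        [sys.executable, "-m", module, *args],
        cwd=root, env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def demo():
    """Invoke the same entry point through both `python -m` routes."""
    with tempfile.TemporaryDirectory() as root:
        write_package(root)
        via_package = run_module(root, "demo_pkg", "--flag")
        via_cli = run_module(root, "demo_pkg.cli", "--flag")
    return via_package, via_cli
```

Both invocations end up in the same private `main` function, so the public entry points stay thin and interchangeable.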

webknjaz avatar Feb 25 '24 12:02 webknjaz

Yes, the way you structured the cheroot package looks like what I had in mind ;-)

chrysle avatar Feb 25 '24 13:02 chrysle

Recently someone was importing from my package's __main__ file which seems pretty dubious to me, but I couldn't find anything saying that users should not do that.

zanieb avatar Feb 25 '24 16:02 zanieb

> Yes, the way you structured the cheroot package looks like what I had in mind ;-)

I didn't really implement it, but I was a reviewer of the PR that added that CLI. I think it only implements a subset of what I now consider good practice.

> Recently someone was importing from my package's __main__ file which seems pretty dubious to me, but I couldn't find anything saying that users should not do that.

I've only seen an `import __main__` once, in a pretty complicated piece of software. Not sure I'd encourage people to do that. Anyway, providing a clear (sub)module named `api` seems like a better signal of what's intended to be used.

webknjaz avatar Feb 27 '24 05:02 webknjaz

Yes, I agree that discouraging it explicitly would make sense. `import __main__` has some pretty specific and complicated meanings, but I'm referring to `from foo.__main__ import bar`, which I feel is weird to consider part of a library's public API.

zanieb avatar Feb 28 '24 02:02 zanieb

My own preference is to have a `cli` submodule, and then have `__main__.py` be just:

```python
if __name__ == "__main__":
    from .cli import main
    main()
```

(The default of taking arguments from `sys.argv[1:]` is then incorporated directly into `package.cli.main`, rather than expecting the caller to handle it.)

That way there's zero temptation to import from package.__main__, since importing from package.cli is the more obvious alternative.
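A minimal sketch of what such a `cli` module might look like, assuming `argparse` is used for parsing. The `name` argument and the greeting are hypothetical placeholders; the relevant part is that `main` accepts an optional `argv` and falls back to `sys.argv[1:]` on its own.

```python
import argparse


def parse_args(argv=None):
    """Parse CLI arguments; argv=None means "use sys.argv[1:]"."""
    parser = argparse.ArgumentParser(prog="package")
    # Hypothetical positional argument, present only to give the sketch a body.
    parser.add_argument("name", nargs="?", default="world")
    # argparse itself falls back to sys.argv[1:] when argv is None,
    # so callers such as __main__.py can simply call main().
    return parser.parse_args(argv)


def main(argv=None):
    """CLI entry point shared by __main__.py and any console script."""
    args = parse_args(argv)
    print(f"hello {args.name}")
    return 0
```

Tests or other tools can then call `main(["Ada"])` with an explicit argument list instead of patching `sys.argv`.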

For a while I used to make my `__main__.py` files outright raise an exception when executed as anything other than a top-level `__main__` module, but I stopped doing that after I realised that the practice breaks multiprocessing's spawn mode, which reruns `__main__` from the parent process as `__mp_main__` in the child process. (This implementation detail isn't formally documented anywhere, but https://stackoverflow.com/questions/72497140/in-python-multiprocessing-why-is-child-process-name-mp-main-is-there-a-way/72497477#72497477 accurately describes how it works.)

Omitting the __name__ guard entirely is also not correct, for the same reason.
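The three variants can be compared with a small simulation. This is only a sketch: it does not start real child processes, but mimics the renaming by executing a temporary script under the name `__mp_main__` with `runpy.run_path`, on the assumption that this approximates what the spawn bootstrap does. The `RAN_CLI` flag and the file name `fake_main.py` are hypothetical stand-ins for a real `main()` call and a real `__main__.py`.

```python
import runpy
import tempfile
from pathlib import Path

# Guarded variant: the CLI only runs when executed as the top-level script.
GUARDED = '''
RAN_CLI = False
if __name__ == "__main__":
    RAN_CLI = True  # stands in for main()
'''

# Unguarded variant: the CLI runs on every execution, whatever the name.
UNGUARDED = '''
RAN_CLI = True  # stands in for an unconditional main()
'''

# Strict variant: refuses to run under any name other than "__main__".
STRICT = '''
if __name__ != "__main__":
    raise RuntimeError("must be run as the top-level script")
RAN_CLI = True
'''


def run_as(source, run_name):
    """Execute the source as a script under run_name; return its globals."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "fake_main.py"
        script.write_text(source)
        return runpy.run_path(str(script), run_name=run_name)


def fails_as(source, run_name):
    """Return True if executing the source under run_name raises RuntimeError."""
    try:
        run_as(source, run_name)
    except RuntimeError:
        return True
    return False
```

Under `__mp_main__`, only the guarded variant both executes cleanly and skips the CLI: the unguarded one would rerun the CLI in the child, and the strict one raises.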

ncoghlan avatar Jul 26 '24 01:07 ncoghlan