packaging.python.org icon indicating copy to clipboard operation
packaging.python.org copied to clipboard

Remove outdated "single sourcing package version" advice

Open wimglenn opened this issue 2 years ago • 38 comments

Now that Python 3.7 is EOL and stdlib importlib.metadata is available in all non-EOL Python versions, it is no longer advisable to keep a __version__ attribute in the source code. Use importlib.metadata.version() to get it from package metadata if necessary.

References:

  • PEP 396 rejection.
  • ".__version__ attributes are IMNSHO generally pointless" - gpshead
  • "Unfortunately, I think __version__ is a fairly ingrained habit for a lot of people. But it’s a holdover from the days before importlib.metadata." - pf_moore
  • "Do you really need/want a __version__ attribute at all?" - me

wimglenn avatar Jul 20 '23 05:07 wimglenn

I mostly agree. But completely deleting the page seems too radical. I guess it would make sense to still provide some guidance on this topic. At the very least we should tell to use importlib.metadata.version(). Also I think I recall reading somewhere that calling importlib.metadata.version() can be resource intensive (I do not remember the details), so we might want to advise against having importlib.metadata.version() directly at the root of a module so that it does not run blindly on import.

sinoroc avatar Jul 22 '23 19:07 sinoroc

Please no.

The rejection of PEP 396 means that the SC doesn't want to say you should use a __version__ attribute, but it does not mean that they are saying you should not use one.. [*]

Having a version attribute is widely done, a good idea (IMHO) and supported by the up-to-date packaging tools (setuptools, etc).

I hope we all agree that if you are going to use a version __version__ attribute then it should be automatically synced with the package version as part of the build -- what this page is about, and it is still useful.

I also note that at least originally, the conclusion of the thread in #182 was that PyPA would not endorse one single way do things -- adding a __version__ attribute is a perfectly reasonable thing to do, it's a still good idea to document how to do it right.

ChrisBarker-NOAA avatar Jul 22 '23 21:07 ChrisBarker-NOAA

@ChrisBarker-NOAA Could you explain why you think having a version attribute is a good idea? You did say "I really like __version__" but you haven't offered any reasons. Without arguments for it, then claiming it's a perfectly reasonable thing or a good idea are just opinions.

Here are some arguments against burning a version string into the source:

  • "Version" is a required field in the package metadata. It's unnecessary and redundant to store it in two places.
  • All Python versions where retrieving version info from metadata was non-trivial are now EOL.
  • When a version string is kept only in the package metadata, then the problem of keeping the source and the metadata in sync simply goes away. That's simple. Keeping them in sync is complicated.
  • The format has never been standardized, some projects use __version__, some use VERSION, some use a tuple instead of a string, sometimes it's in a submodule, and it might not be there at all. As user, it's total guesswork about if/where/how to find a version attribute.
  • Version can be retrieved from the metadata without requiring import of the code. That's useful when the import has side-effects (e.g. numpy), and it's useful when the package can't be imported for some reason (e.g. missing dependencies)
  • Existing tooling looks to the metadata for the truth. You can change it in the source, but stuff like pip list, pip freeze etc would ignore that and display the metadata version field.

So, there are a bunch of disadvantages. What are the advantages? Do they outweigh the disadvantages?

wimglenn avatar Jul 23 '23 00:07 wimglenn

There is also the fact that an import package is not the same thing as a distribution package. It is a subtle detail and all things considered it might not matter much for this task, but there is still a difference.

When I do import something; print(something.__version__) should I really get the same value as when I do import importlib.metadata; print(importlib.metadata.version('Something'))?

sinoroc avatar Jul 23 '23 08:07 sinoroc

Could you explain why you think having a version attribute is a good idea? You did say https://github.com/pypa/packaging.python.org/issues/182#issuecomment-1633325315 but you haven't offered any reasons.

I thought about writing all that, but this really isn't the place for that discussion.

I don't have any particular pull here, but I think the goal of these docs is to document the state of practice -- and having a __version__ is currently a common practice, so it should be documented here.

There is also the fact that an import package is not the same thing as a distribution package. It is a subtle detail and all things considered it might not matter much for this task, but there is still a difference.

Yes, indeed -- which if one reason I prefer version attribute :-)

"Version" is a required field in the package metadata. It's unnecessary and redundant to store it in two places.

This entire page is about removing that redundancy, isn't it?

When I do import something; print(something.version) should I really get the same value as when I do import importlib.metadata; print(importlib.metadata.version('Something'))?

Only if the importable package name is the same as the distribution name.

he format has never been standardized, some projects use version, some use VERSION, some use a tuple instead of a string, sometimes it's in a submodule, and it might not be there at all. As user, it's total guesswork about if/where/how to find a version attribute.

yeah, which is why I thought PEP 396 was a good idea -- but even since that PEP was written, there's been a LOT more standardization in the community -- but another topic anyway -- at least now there IS a standard for what a version string should look like, THAT doesn't need to change.

But again, not the place for this debate.

ChrisBarker-NOAA avatar Jul 24 '23 22:07 ChrisBarker-NOAA

@webknjaz This is not blindly removing the guide. It is removing the guide with a reasoned set of arguments, as described in the comments here. Regardless of where the source of truth is, the premise that a __version__ attribute is available in the source code at all is not an assumption which packaging.python.org should make, since there is no specification about how to do that.

What changes are you requesting?

wimglenn avatar Sep 06 '23 18:09 wimglenn

We can rename it to _VERSION if you like that variant more. Not everything in PyPUG is spec-based. Some of it is just trying to make tribal knowledge recorded.

But as I said before, single-sourcing of a version value is a legit use case and is what people often aspire to. Regardless of what the source is or how it's called.

webknjaz avatar Sep 06 '23 19:09 webknjaz

  • Version can be retrieved from the metadata without requiring import of the code. That's useful when the import has side-effects (e.g. numpy), and it's useful when the package can't be imported for some reason (e.g. missing dependencies)

This dismisses the case when a library/app itself makes use of the version and it does need a variable to store it somewhere. And this is one of the aspects of single-sourcing — the version is a part of the metadata but it's often also necessary in runtime. And yes, that variable can be inferred from the installed package metadata, if it's a library. But many people have other preferences. Also, apps are often not packaged as dists and so they need to rely on different sources of versioning or the mechanism of generating them.

I think it's acceptable to augment the guide with clarifications and other example but dismissing it as a whole and pretending that people don't do this is rather harmful.

webknjaz avatar Sep 06 '23 19:09 webknjaz

The rejection of PEP 396 means that the SC doesn't want to say you should use a __version__ attribute, but it does not mean that they are saying you should not use one..

Exactly. Moreover, some PEPs, even being rejected, end up being widely used across the ecosystem, as __version__ clearly demonstrates. Sometimes even other PEPs rely on the ideas and concepts laid out in the rejected ones.

One other case of a "rejected" concept is "attribute docstrings" — https://peps.python.org/pep-0258/#attribute-docstrings. The accepted PEP 257 shows some legit use of of "rejected attribute docstrings": https://peps.python.org/pep-0257/#what-is-a-docstring.

Back to __version__, it is explicitly called out in PEP 8, that most pythonistas agree to follow: https://peps.python.org/pep-0008/#module-level-dunder-names.

By the way, there's a PEP 723 draft in the works: https://peps.python.org/pep-0723/. This essentially allows moving pyproject.toml back into (standalone) Python modules. So there you go — the version is in a .py file once again. I suppose, when it's accepted, this guide could also be extended to explain how to work with this source too.

webknjaz avatar Sep 06 '23 19:09 webknjaz

When I do import something; print(something.version) should I really get the same value as when I do import importlib.metadata; print(importlib.metadata.version('Something'))?

Only if the importable package name is the same as the distribution name.

This is not always possible since the dist name can contain dashes that aren't allowed in Python identifiers. Also, a standalone script might be functional even without being packaged or pip installed, so it'll likely have a fallback for such case.

webknjaz avatar Sep 06 '23 19:09 webknjaz

If we want to keep this page of the guide, then there is a whole bunch of clarifications to be made.

  • The __version__ attribute is not a standard, but a relatively widely accepted convention.
  • Zero guarantee that the rest of the ecosystem will honor the value of __version__.
  • Getting the build back-end (or plugin like setuptools-scm) to extract the version string from the code (for example from a __version__ attribute) is not something that is standardized (i.e. it can not be configured in the [project] table of pyproject.toml), but something that is build back-end specific and might vary from one build back-end to another (we should be careful about this kind of things, we do not want users to mistake this for a standard and them come complain to PyPA when this breaks).
  • Be mindful that the distribution package name and the import package name are not the same thing (in case where one wants to use importlib.metadata.version()), even though by convention we try to keep them to be equal, those are still two different concepts.
  • Be mindful that maybe the distribution package version string (the one in the metadata) is not the same as the import package version string (the one in __version__), I can not find it now but I recall quite clearly reading that some projects do this (2 different version strings).
  • Packaging metadata is accessible (for example via importlib.metadata) if and only if the distribution package is installed correctly (via pip or something else).
  • Importing a package has a cost, and one should try to avoid adding to this cost by having expensive logic that computes the value of __version__ attribute at import time.

sinoroc avatar Sep 06 '23 19:09 sinoroc

  • When a version string is kept only in the package metadata, then the problem of keeping the source and the metadata in sync simply goes away. That's simple. Keeping them in sync is complicated.

Yet, the version can be sourced dynamically on build, which makes external resources like Git the actual source of truth and the metadata is an intermediate snapshot of that data from SCM (prone to becoming outdated in cases like development w/ editable installs). I agree that in the generic case, a project should consult metadata as the priority source of truth but that doesn't mean that we should be forbidding certain variable names.

Another case is when a project uses said variable as its ultimate source of truth. With this in mind, that variable would be external to the build system generating the version, just like Git/SCM might be.

With a version in setup.cfg or pyproject.toml, the version is still external to the build system producing the metadata.

I think there's some confusion as to how versions are being used and what single-sourcing is about. In terms of packaging, that version sourcing is about getting the version number from somewhere and putting it into the metadata. In terms of runtime, it's about getting the version from somewhere (like metadata) and making use of it in a context that is not related to packaging.

So if we present this as a unidirectional flow, it'd be something like this:

---
title: Version sourcing flow
---

graph LR;

    single-source[("
        Non-standardized ultimate version source:
            - build system config file
            - CLI flag/option/argument
            - environment variable
            - *.py module import
            -  static parsing of a *.py module as a text file
            - Git/setuptools-scm
            - API call
            - database lookup
    ")];

    meta[(True-to-spec package metadata)];

    runtime{{"
        occasional access to the metadata version,
        maybe through __version__
    "}};

    single-source --> meta --> runtime;

    runtime -. runtime may sometimes be the source but it is mostly usable for pure-python projects .-> single-source;

The guide is describing the above, or at least, a part of it. And I think it should remain, possibly be improved with additions, but not deleted.

webknjaz avatar Sep 06 '23 20:09 webknjaz

@sinoroc yep, I agree with all the points.

webknjaz avatar Sep 06 '23 20:09 webknjaz

We should probably ask opinions of @RonnyPfannschmidt and @jaraco, though, as they have a lot of experience with this specific topic.

webknjaz avatar Sep 06 '23 20:09 webknjaz

When I do import something; print(something.version) should I really get the same value as when I do import importlib.metadata; print(importlib.metadata.version('Something'))?

Only if the importable package name is the same as the distribution name.

This is not always possible since the dist name can contain dashes that aren't allowed in Python identifiers. Also, a standalone script might be functional even without being packaged or pip installed, so it'll likely have a fallback for such case.

Oh, and I forgot to mention that dists may ship multiple importable packages/modules, not just one. So on distribution name can be mapped to multiple importables. Such dists are setuptools and pytest, for example.

webknjaz avatar Sep 06 '23 20:09 webknjaz

@flying-sheep @abravalheri I saw you commenting on the neighboring PR so FYI ^

webknjaz avatar Sep 06 '23 20:09 webknjaz

i like the motion of going away from __version__ in files - its a headache and setuptools_scm still cant provide a save provider of the version file

that being said, there probably ought to be a section on providing a __version__ attribute by importlib.metadata.

RonnyPfannschmidt avatar Sep 06 '23 21:09 RonnyPfannschmidt

I'm going to rephrase @webknjaz's comment:

I'm strongly against removing the guide with eyes wide open.

There is NOT a consensus yet about the "one true way" to handle version specification, but there IS consensus that at the very least, however you do the version specification should be DRY -- i.e. "One source of truth" -- and THAT is what this page is about. So it still serves a purpose, and I don't think it should be simply removed unless / until a proper consensus is reached.

And as @webknjaz said this guide does not say "you shold put a __version__ attribute in your package, but rather, IF you have a version string somewhere in your code, this how you should use it properly to set the version in your built package.

Anyway, I think there is also a consensus that the text as it stands is pretty darn out of date, so keeping the page as is is a disservice to the community.

NOTE: :Last Reviewed: 2015-09-08 -- that's pretty ancient!

Could we please merge a PR along the lines of: #1273 [*] sooner than later, while the discussion continues about the ultimate one true way?

Where should that discussion be held? I'm not sure this PR is the right place.

[*] "along the lines of" could be a slightly edited version (I'll go now and address a few comments) - or a bigger re-write if someone wants to tackle it. Maybe put the "extract version from SCM" at the top :-) -- note that when I wrote that PR, I purposely made as few changes as possible, I naively thought that would make it less controversial. So maybe a larger change would be better -- perhaps completely remove the "no longer recommended" section?

ChrisBarker-NOAA avatar Sep 06 '23 21:09 ChrisBarker-NOAA

IMO there is nothing valuable worth saving from the existing page, but I wouldn't be against a total re-write that is build backend agnostic.

I'm not sure what's the urgency for merging #1273 - it still has several problems, and it's still setuptools/distutils specific.

wimglenn avatar Sep 07 '23 01:09 wimglenn

@ChrisBarker-NOAA thanks for summarizing the above! I think your PR has potential!

@wimglenn the most important bit of this guide is "how to bring the version string from somewhere, into the package metadata". Removing it just because of a side effect of how the version is additionally accessed in runtime is not an option. You've been making arguments about the runtime (right side of my illustration) but it does hurt the usefulness of the left-side part — the build process.

webknjaz avatar Sep 07 '23 01:09 webknjaz

My view is that:

  1. The only safe way of obtaining the version at runtime that is backed by standards is importlib.metada or similar (i.e. actually reading the *.dist-info/METADATA file, which is the only standardised way of storing this info).

  2. This does not mean that users don't need guidance about how to populate this value from other files (e.g. consider you want to distribute Python bindings for a native library, and that you want the version of your bindings to mimic the version of the original native library). This is a valid use case. (Equally valid is the use case of keeping the single source of truth for version in the VCS).

  3. Providing guidance for single-sourcing version infirmation (2) is backend specific.

Personally, I would not completely remove all the guidance for single-sourcing, even if it is backend specific, because that is very useful for the users.

If the website mainters prefer to not have backend-specific instructions, we can transfer the setuptools-specific docs to the setuptools website, but it would be good if we can leave at least a link so users can be pointed out in the right direction.

abravalheri avatar Sep 11 '23 16:09 abravalheri

Hi @wimglenn. Thanks for the PR. I think the consensus is leaning toward updating the doc instead of removing it. I'm linking to my review https://github.com/pypa/packaging.python.org/pull/1273#issuecomment-1806693140 on #1273.

willingc avatar Nov 11 '23 05:11 willingc

CI failures seem to be due to https://pyscaffold.org/ returning 403 and unrelated to my change.

I've considered all the comments here but still believe removing this page is the best way forward. It's up to individual build backends to document how they support keeping the metadata version in sync with source attributes or git tags, if they want to support that at all. It's a matter which is out of scope for the packaging.python.org guide, which should aim to be tool-agnostic.

wimglenn avatar Jul 23 '24 02:07 wimglenn

See my comment on #1273

But in short -- a simple placeholder with a bit of info would be better that simply removing, and turning external references to a 404.

ChrisBarker-NOAA avatar Jul 23 '24 20:07 ChrisBarker-NOAA

Meh? That's what the wayback machine is for..

These are living documents. If we left placeholders every time some content was removed, eventually the majority of the content would be such placeholders.

wimglenn avatar Jul 23 '24 20:07 wimglenn

It's not JUST a placeholder -- it's also useful info.

But whatever the doc editors decide...

ChrisBarker-NOAA avatar Jul 23 '24 21:07 ChrisBarker-NOAA

Personally, I was hoping to merge #1273 as it's got better potential in my opinion.

webknjaz avatar Jul 24 '24 16:07 webknjaz

Also weighing in from the peanut gallery, I also think removing this content is a mistake.

tacaswell avatar Jul 24 '24 21:07 tacaswell

Please be aware of the context: people did try to improve it starting a year ago, then that effort got stalled over bikeshedding. The end result is that the outdated information is still there, misleading people.

I’d say unless someone is willing to do some quick rounds on #1273 to get it into a shape everyone likes, removing this document is preferable to doing nothing.

flying-sheep avatar Jul 25 '24 06:07 flying-sheep

Fair enough -- I certainly agree that having the outdated info up is the worst case scenario.

I'll try to take a look at making #1276 far more simple -- I would love some help.

ChrisBarker-NOAA avatar Jul 25 '24 15:07 ChrisBarker-NOAA