Please make trove classifier validation optional/non-fatal
Currently, Hatchling enforces at build time that all trove classifiers in the project are found in the trove-classifiers package. However, Hatchling itself does not specify a minimum required trove-classifiers version, and I have yet to see a single project explicitly listing it as a dependency. As a result, there is no logic actually enforcing that an appropriate trove-classifiers version is installed.
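For context, the check in question amounts to something like the following simplified sketch (not Hatchling's actual code; in Hatchling, the known set is effectively `trove_classifiers.classifiers`, a static set baked into whatever trove-classifiers release happens to be installed):

```python
# Simplified sketch of the build-time check described above
# (illustrative only, not Hatchling's actual implementation).
def validate_classifiers(project_classifiers, known):
    # `known` would come from the installed trove-classifiers package,
    # e.g. `trove_classifiers.classifiers`, frozen at its release time.
    for classifier in sorted(project_classifiers):
        if classifier not in known:
            message = f'Unknown classifier in field `project.classifiers`: {classifier}'
            raise ValueError(message)
```

Because the known set only contains classifiers that existed when the installed trove-classifiers was released, any newer classifier fails validation.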
We've recently gotten yet another bug report about a user trying to build a package with a too-old trove-classifiers installed. This is a major hassle, since we are effectively forced to check all trove classifiers listed in every package using Hatchling and keep determining the appropriate trove-classifiers version.
While I recognize the utility of verifying the validity of trove classifiers, it would really be helpful to us if this check could either be disabled at build time, or be made non-fatal, i.e. emit a warning rather than result in build failure. Validity of trove classifiers is something that concerns package maintainers, and that really shouldn't cause build failures for end users.
To reproduce the problem, based on the recent bug report:
```
$ pip install -q "trove-classifiers < 2023.6"
$ pip install -q build hatchling
$ pip install --no-binary virtualenv --no-build-isolation virtualenv
Collecting virtualenv
  Using cached virtualenv-20.25.1.tar.gz (7.2 MB)
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [21 lines of output]
      Traceback (most recent call last):
        File "/tmp/venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/tmp/venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 152, in prepare_metadata_for_build_wheel
          whl_basename = backend.build_wheel(metadata_directory, config_settings)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/venv/lib/python3.11/site-packages/hatchling/build.py", line 58, in build_wheel
          return os.path.basename(next(builder.build(directory=wheel_directory, versions=['standard'])))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/venv/lib/python3.11/site-packages/hatchling/builders/plugin/interface.py", line 90, in build
          self.metadata.validate_fields()
        File "/tmp/venv/lib/python3.11/site-packages/hatchling/metadata/core.py", line 261, in validate_fields
          self.core.validate_fields()
        File "/tmp/venv/lib/python3.11/site-packages/hatchling/metadata/core.py", line 1353, in validate_fields
          getattr(self, attribute)
        File "/tmp/venv/lib/python3.11/site-packages/hatchling/metadata/core.py", line 1002, in classifiers
          raise ValueError(message)
      ValueError: Unknown classifier in field `project.classifiers`: Programming Language :: Python :: 3.13
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
```
I can add an option to prevent this check, but please understand that this mostly only affects redistributors, because normally the latest version would be pulled in on CI.
How critically should I take this? I'm trying to assess when I would have to implement this option.
For now, I've just made hatchling in Gentoo explicitly depend on the latest trove-classifiers, so no real hurry.
Hey, this is really painful.
I've added `"Programming Language :: Python :: 3.13",` to pyproject.toml, and now I'm getting failures trying to build packages on Debian 12 Bookworm when using pybuild/hatchling:
```
  File "/usr/lib/python3/dist-packages/hatchling/metadata/core.py", line 979, in classifiers
    raise ValueError(message)
ValueError: Unknown classifier in field `project.classifiers`: Programming Language :: Python :: 3.13
```
That's the hatchling 1.12.2-1 Debian package; it doesn't even have `Depends: python3-trove-classifiers`, and `pip install -U trove_classifiers` has no effect.
Given that stable distros use years-old packages (for good reasons), this could become a huge issue going forward as new Python versions are released: it prevents packages using hatchling from being updated in Debian and elsewhere once their upstreams add support for Python 3.13, forcing unneeded downstream patches that strip the problematic line or otherwise work around this.
I see that latest trove_classifiers has:
```
> python3 -m trove_classifiers | grep 'Language :: Python'
..
Programming Language :: Python :: 3.11
Programming Language :: Python :: 3.12
Programming Language :: Python :: 3.13
Programming Language :: Python :: 3.14
```
So in two years we'll hit the same problem with Python 3.15 on the current version. This is not sustainable.
Please make sure that future `Programming Language :: Python :: *` classifiers never result in build failures; this is basically a ticking time bomb. A build system must be robust over time, not randomly explode when out of sync with the latest upstream releases.
As a packager, I don't think out-of-date classifiers are a valid reason for a build failure, because they evolve over time. Python has five years of support, so the latest pyproject.toml should work with five-year-old Python packages.
Failing on a classifier mismatch seems excessive and harmful; I suggest defaulting to a warning, with an option to fail on warnings.
Other than this shot in the foot, hatchling has been mostly smooth sailing. Thanks for working on it, and let me know if I can help with the Debian side of things :ok_hand:
To get your package working for now you can add a metadata hook with the next few years of Python version classifiers: https://hatch.pypa.io/latest/plugins/metadata-hook/reference/#hatchling.metadata.plugin.interface.MetadataHookInterface.get_known_classifiers
Also, is it possible within your distribution's tooling to remove from that section the Python classifiers known not to work, e.g. Python 3.13 wouldn't be on Debian Buster?
Anything is "possible".
It's possible to spend tons of man-hours investigating build automation failures, patching an unknown number of packages using the mechanism typically reserved for backporting CVE fixes, then rerolling those patches every time the software is updated, each time remembering to check whether the patch is still necessary (probably only checking when the software is updated, patched, or rebuilt for updates to C library dependencies), all to defang hatchling and its developer linting that escaped into production.
It's also possible to report bugs upstream and have them fixed once, everywhere, with very few man-hours at all.
> To get your package working for now you can add a metadata hook with the next few years of Python version classifiers:
Doesn't this defeat the purpose of having a developer lint at all? If upstream developers of PyPI packages add manual code to work around this because it's required for Debian stable, then that manual workaround still applies to all consumers, and nothing is ever checked, even in CI, even when it would actually be problematic to upload to PyPI.
Additionally, it only helps for software where the upstream developer is also the Debian packager. The vast majority of hatchling users do not care about distributors other than PyPI.
Thanks, I worked around it by using legacy setuptools in packaging for now (I need to maintain that anyway for older distros), but these workarounds might be useful to other readers here.
However, I'm more interested in the future of hatchling (and the projects using it) than in this particular pain point and its workaround: a future that doesn't require workarounds. IMHO the cons of failing on a classifier mismatch by default far outweigh its gains (not sure about those, to be honest), as I've elaborated above.
Let me present an additional example: in Debian, there is the lintian packaging linter, which runs tons of checks and is well integrated:
- Debian Salsa CI runs lintian jobs which are red on errors.
- The Debian package tracker displays lintian warnings.
- Many build tools run lintian automatically after a package build.
- etc.
I think almost every Debian packager uses lintian in one way or another, but its errors never fail a package build. The same package will produce different errors/warnings on different lintian versions, and some of them are false positives.
Similarly, checking classifiers is a nice linting feature, and I'm glad it exists, but failing the build based on it is simply not a good idea. It will only create extra work supporting older projects on older Pythons/systems for packagers and other folks who don't ride the "latest and greatest (and randomly broken)" wave.
Okay let me ask you both a question. Say the default of checking for invalid classifiers remains but there is an option as this issue requests for disabling this check. At what point in the chain does this flag get used? Certainly maintainers would never disable this check on GitHub/for PyPI.
> Additionally, it only helps for software where the upstream developer is also the Debian packager.
It doesn't really help; I'm not happy doing additional workarounds as opposed to simply packaging the working upstream version. I distinctly dislike downstream patches.
> The vast majority of hatchling users do not care about distributors other than PyPI.

True, the vast majority of Python developers don't care about anything beyond the latest and greatest on PyPI. But if software is useful, it will get distributed to users who have various restrictions and needs (such as not wanting their system to randomly explode, which is why they use a stable distro), so a friendly Python build system should take care of these hard distribution problems for them.
I've been both developing and packaging Python since 2.4, and I've witnessed countless Python horrors beyond comprehension. I had really hoped that new tools such as hatch and poetry would usher in a new era where Python is actually distributable, but a new Python version came out and yet again my project is broken on a year-old distro for ridiculous reasons the user should seriously not care about.
Please do take distros and their users into consideration; this isn't specific to Debian. I wish classic packaging were obsolete, but it isn't.
Yes, I definitely understand your perspective! I'm asking if having an option to disable the check would actually help your scenario.
> Okay let me ask you both a question. Say the default of checking for invalid classifiers remains but there is an option as this issue requests for disabling this check. At what point in the chain does this flag get used? Certainly maintainers would never disable this check on GitHub/for PyPI.
I'm not familiar with how the build module passes arguments to the build backend, but distros have some tooling around it and the backends, so I assume it would be viable to add an option to make the linting non-fatal.
However, I'm not sure about the best implementation of such option.
> Yes, I definitely understand your perspective! I'm asking if having an option to disable the check would actually help your scenario.
I also definitely understand the perspective of people who don't care about distros at all :) And yet distros are still relevant, so here are my two cents; I hope they help you make an informed decision.
I still think linting shouldn't fail the build by default; warnings would be a much more pragmatic approach that wouldn't require extra workarounds in various places. There WILL be false positives due to the temporal nature of the classifiers.
Still, if you insist on enforced classifier purity, yes, please add an option to make it non-fatal :pray:
> Okay let me ask you both a question. Say the default of checking for invalid classifiers remains but there is an option as this issue requests for disabling this check. At what point in the chain does this flag get used? Certainly maintainers would never disable this check on GitHub/for PyPI.
I think it's intended to be an end-user/redistributor option. IMO the simplest way to do this is via an environment variable, as I've done in the linked PR. It has the advantage that it can be used easily (argument passing may be hard for end users) and immediately (i.e. we can use it unconditionally without checking the hatchling version).
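A minimal sketch of how such an escape hatch could behave (illustrative only; the variable name and function are placeholders for the discussion here, not Hatchling's actual implementation, which is in the linked PR):

```python
import os
import warnings


# Illustrative sketch of a non-fatal classifier check gated by an
# environment variable; all names are placeholders, not Hatchling's API.
def check_classifiers(project_classifiers, known):
    unknown = [c for c in project_classifiers if c not in known]
    if not unknown:
        return
    message = f'Unknown classifiers in field `project.classifiers`: {", ".join(unknown)}'
    if os.environ.get('HATCH_METADATA_CLASSIFIERS_NO_VERIFY'):
        # Redistributor/end-user opt-out: downgrade the failure to a warning.
        warnings.warn(message, stacklevel=2)
    else:
        raise ValueError(message)
```

An environment variable also composes well with distro build tooling, which can export it unconditionally without caring which hatchling version is installed.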
I think your environment variable idea is a good one and I forgot about that, thanks! Although, I would like to hear confirmation from them that their situation will be fixed by this because it's unclear to me at what point in the process that environment variable will be used.
@jruzicka-nic, could you please test #1620 and tell us whether that would work for you?