add support for PEP 639 License Clarity
Related to: python-poetry/poetry#9670
- [x] Added tests for changed code.
- [ ] Updated documentation for changed code.
Summary by Sourcery
Implement PEP 639 license clarity support by adding parsing and validation for project.license-files and SPDX license expressions, bump core metadata version to 2.4, include license data in distributions, and deprecate legacy license tables and classifiers.
New Features:
- Support the [project].license-files key with glob patterns for specifying license files.
- Support SPDX license expressions via [project].license as a top-level string instead of the legacy table format.
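As a rough illustration of the two forms mentioned above, the sketch below shows what a parsed `[project]` table might look like in the new style and in the legacy style. The project values and the `license_style` helper are hypothetical and are not taken from this PR; they only show that the new form is a plain string while the legacy form is a table.

```python
from __future__ import annotations

from typing import Any

# Equivalent pyproject.toml fragment (illustrative project, not from the PR):
#
#   [project]
#   name = "demo"
#   version = "1.0.0"
#   license = "MIT OR Apache-2.0"
#   license-files = ["LICENSE*", "licenses/*.txt"]
NEW_STYLE: dict[str, Any] = {
    "name": "demo",
    "version": "1.0.0",
    "license": "MIT OR Apache-2.0",
    "license-files": ["LICENSE*", "licenses/*.txt"],
}

# Legacy table form: license = { text = "MIT" }
LEGACY_STYLE: dict[str, Any] = {
    "name": "demo",
    "version": "1.0.0",
    "license": {"text": "MIT"},
}


def license_style(project: dict[str, Any]) -> str:
    """Hypothetical helper: classify how [project].license is written."""
    value = project.get("license")
    if isinstance(value, str):
        return "spdx-expression"  # PEP 639 top-level string
    if isinstance(value, dict):
        return "legacy-table"  # deprecated {text = ...} / {file = ...} form
    return "unspecified"
```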
Bug Fixes:
- Raise errors when a legacy [project].license table is combined with license-files, or when license-files contains invalid glob patterns.
- Provide clear exceptions for missing or unreadable license files during package creation and metadata export.
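The checks described above could look roughly like the following sketch. `check_license_config` is a hypothetical name, not a function from this PR, and the character whitelist is a simplified ASCII-only reading of the glob rules in the standard (relative patterns only, no `..`, forward slashes as separators).

```python
from __future__ import annotations

import re
from typing import Any

# Simplified assumption: ASCII alphanumerics, "_", "-", ".", and the glob
# characters "*", "?", "/", "[" and "]" are the only characters allowed.
_ALLOWED_GLOB = re.compile(r"[A-Za-z0-9_.\-*?/\[\]]+")


def check_license_config(project: dict[str, Any]) -> list[str]:
    """Hypothetical helper: collect error messages for a [project] table."""
    errors: list[str] = []
    license_files = project.get("license-files")
    if license_files is None:
        return errors

    # A legacy license table must not be combined with license-files.
    if isinstance(project.get("license"), dict):
        errors.append("license-files cannot be used with a license table")

    for pattern in license_files:
        if pattern.startswith("/"):
            errors.append(f"{pattern!r}: pattern must be relative")
        elif ".." in pattern.split("/"):
            errors.append(f"{pattern!r}: parent directory '..' is not allowed")
        elif not _ALLOWED_GLOB.fullmatch(pattern):
            errors.append(f"{pattern!r}: pattern contains invalid characters")
    return errors
```

Returning a list of messages rather than raising on the first problem is a design choice of this sketch; it lets a caller report every issue at once.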
Enhancements:
- Bump core metadata version to 2.4 and emit License-Expression and License-File fields in built distributions.
- Emit deprecation warnings for legacy [project].license table subkeys and deprecated license classifiers during strict validation.
- Place license files under dist-info/licenses in built wheels and include them in source distributions.
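To make the emitted fields and the wheel layout concrete, here is a minimal sketch. Both function names are hypothetical and the field order is an assumption; only the field names, the `2.4` version, and the `dist-info/licenses` location come from the description above.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def render_license_metadata(
    expression: str | None, license_files: tuple[str, ...]
) -> list[str]:
    """Hypothetical helper: build the license-related core metadata lines."""
    lines = ["Metadata-Version: 2.4"]
    if expression:
        lines.append(f"License-Expression: {expression}")
    # One License-File field per resolved file, using relative POSIX paths.
    lines.extend(f"License-File: {path}" for path in license_files)
    return lines


def wheel_license_path(dist_info: str, license_file: str) -> str:
    """Hypothetical helper: location of a license file inside a wheel."""
    # The relative path of the file is preserved below "licenses/".
    return str(PurePosixPath(dist_info) / "licenses" / license_file)
```

For example, `wheel_license_path("demo-1.0.0.dist-info", "LICENSE")` gives `demo-1.0.0.dist-info/licenses/LICENSE`.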
Tests:
- Add extensive tests for license parsing scenarios, invalid glob patterns, validation warnings, metadata generation, and builder inclusion of license files.
Reviewer's Guide
This PR implements full support for PEP 639 “License Clarity”. It extends the core factory to parse the new [project].license-files globs and SPDX license expressions, enhances strict validation, and bumps core metadata to 2.4. It also updates the masonry builders to emit License-Expression and License-File fields and to embed license files under dist-info/licenses, and it updates or adds tests to cover the new scenarios.
Sequence Diagram: PEP 639 License Processing
```mermaid
sequenceDiagram
    actor Developer
    participant P as PyProjectTOML
    participant F as Factory
    participant PP as ProjectPackage
    participant M as Metadata
    participant B as Builder
    Developer->>P: Defines [project].license (SPDX)\nand [project].license-files
    F->>P: Reads pyproject.toml data
    F->>PP: _configure_package_metadata(package, project_data)
    activate F
    F->>F: canonicalize_license_expression(project_data["license"])
    PP->>PP: set package.license_expression
    F->>F: Validate license-files globs
    PP->>PP: set package.license_files (globs)
    deactivate F
    F->>F: validate(toml_data)
    activate F
    F->>F: _validate_project(project_data, result) // Validates SPDX, warns on legacy
    deactivate F
    M->>PP: from_package(package)
    activate M
    M->>M: set meta.metadata_version = "2.4"
    M->>M: set meta.license_expression (from package.license_expression)
    M->>PP: Processes package.license_files (globs from package.root_dir)
    activate PP
    PP->>PP: package.root_dir.glob(pattern)
    deactivate PP
    opt Globs match no files
        M->>M: Raise RuntimeError
    end
    M->>M: set meta.license_files (resolved relative paths)
    deactivate M
    B->>M: get_metadata_content()
    activate B
    B->>B: Writes Metadata-Version: 2.4
    opt meta.license_expression is set
        B->>B: Writes License-Expression: ...
    end
    loop for each license_file in meta.license_files
        B->>B: Writes License-File: ...
    end
    deactivate B
    B->>M: _get_legal_files()
    activate B
    B->>B: Returns files based on meta.license_files
    deactivate B
    B->>B: Includes license files (e.g., in dist-info/licenses/)
```
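The glob-resolution step in the diagram above (patterns expanded from the package root, `RuntimeError` when nothing matches) can be sketched as follows. `resolve_license_files` is a hypothetical name, and raising per pattern rather than once for the whole list is an assumption of this sketch.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def resolve_license_files(root: Path, patterns: list[str]) -> tuple[str, ...]:
    """Hypothetical helper: expand license-files globs relative to root."""
    resolved: set[str] = set()
    for pattern in patterns:
        # Only regular files count as matches; directories are ignored.
        matches = [path for path in root.glob(pattern) if path.is_file()]
        if not matches:
            raise RuntimeError(f"license-files pattern {pattern!r} matched no files")
        resolved.update(path.relative_to(root).as_posix() for path in matches)
    # Sorted, de-duplicated relative paths with forward slashes.
    return tuple(sorted(resolved))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "LICENSE").write_text("MIT")
        (root / "licenses").mkdir()
        (root / "licenses" / "BSD.txt").write_text("BSD")
        print(resolve_license_files(root, ["LICENSE*", "licenses/*.txt"]))
```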
Class Diagram: PEP 639 License Handling Changes
```mermaid
classDiagram
    direction LR
    class Factory {
        +String _configure_package_metadata(ProjectPackage package, dict project, dict tool_poetry, Path root)
        +None _validate_project(dict project, dict result)
    }
    class ProjectPackage {
        +license_expression: NormalizedLicenseExpression
        +license_files: LicenseFileConfig
        +List~String~ all_classifiers()
    }
    class Metadata {
        +String metadata_version = "2.4"
        +String license_expression
        +Tuple~String~ license_files
        +Metadata from_package(ProjectPackage package)
    }
    class Builder {
        #Metadata _meta
        +String get_metadata_content()
        #Set~Path~ _get_legal_files()
    }
    class WheelBuilder {
        +Path prepare_metadata(Path metadata_directory)
    }
    Factory ..> ProjectPackage : configures
    Metadata ..> ProjectPackage : generated from
    Builder ..> Metadata : uses
    WheelBuilder --|> Builder
```
File-Level Changes
| Change | Details | Files |
|---|---|---|
| Enhance factory parsing of license and license-files | | `src/poetry/core/factory.py` |
| Add strict validation rules for project license and classifiers | | `src/poetry/core/factory.py` |
| Bump metadata version and compute license data | | `src/poetry/core/masonry/metadata.py` |
| Update builders to emit license fields and include files | | `src/poetry/core/masonry/builders/builder.py`<br>`src/poetry/core/masonry/builders/wheel.py` |
| Revise tests and fixtures for PEP 639 support | | `tests/test_factory.py`<br>`tests/masonry/test_metadata.py`<br>`tests/masonry/builders/*`<br>`tests/masonry/test_api.py`<br>`tests/packages/test_package.py`<br>`tests/fixtures` |
| Bump packaging requirement for PEP 639 support | | `vendors/pyproject.toml` |
Wow, all this code only for parsing new license fields?
> Wow, all this code only for parsing new license fields?
And we do not even do the SPDX parsing ourselves; we use packaging for that. However, the standard contains many MUSTs and SHOULDs, which requires a lot of error handling. Apart from that, I added (quite long) comments with extracts from the standard to help explain possibly unintuitive parts of the implementation. And of course, more than half of the new code is tests.
I noticed, thanks for all the work!