warehouse
Add package uploader/maintainers to the Package metadata API
What's the problem this feature will solve?
Help identify trustworthy package uploaders. Currently, the package metadata API (https://pypi.org/pypi/{package_name}/json) returns the repository maintainers, but not the package maintainers. Exposing the package uploaders/maintainers would help establish a package's credibility or expose risks.
Describe the solution you'd like
Add the package maintainers to the API response. If the package maintainers' historic contributions could also be added to this or a separate API, that would help identify trustworthy packages.
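To make the request concrete, here is a minimal sketch of how a client might use such a field. The `maintainers` key, its shape, and the usernames are all assumptions made for illustration; nothing like it exists in the JSON API today.

```python
# Hypothetical response fragment: the "maintainers" key does not exist in
# the JSON API today, and the usernames below are made up for illustration.
payload = {
    "info": {"name": "example-package"},
    "maintainers": [
        {"username": "alice", "name": "Alice Example", "roles": ["Owner"]},
        {"username": "mallory", "name": "", "roles": ["Maintainer"]},
    ],
}


def unknown_maintainers(payload, trusted_usernames):
    """Return the usernames on a package that are not in the trusted set."""
    usernames = {m["username"] for m in payload.get("maintainers", [])}
    return sorted(usernames - set(trusted_usernames))


# Usernames the caller already trusts (also hypothetical).
unknown = unknown_maintainers(payload, trusted_usernames={"alice", "bob"})
print(unknown)
```

A check like this would flag a look-alike package whose uploaders do not overlap with the accounts a user already trusts.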
Additional context
The motivating case is home-brewed or forked packages, which should not inherit credibility, such as https://pypi.org/project/f-ask/. At a glance, this package (incorrectly) looks to be owned by the pallets team, which carries a different level of trust. This is only an example; please do not take action against whoever uploaded it. I have not checked whether it is a malicious typo-squat, as that is irrelevant to the problem to fix.
I second @Duppils, though my motivation for wanting this information is to disambiguate PyPI users against GitHub user accounts, Wikidata entries, and ORCID identifiers, so that we in the computational life/natural sciences (and others) can better report on bibliometrics of software.
What code needs to be changed
The following code is responsible for what gets put on the metadata API (https://pypi.org/pypi/{package_name}/json). The trick is just to connect the Project model in the database to the associated information, then do some SQLAlchemy magic (i.e., joining + filtering) to get it out:
https://github.com/pypi/warehouse/blob/186180c600da5a99a7bdc0da548f852351d91e84/warehouse/legacy/api/json.py#L63-L186
Investigation
This appears to be the enum responsible for people's roles in a package:
https://github.com/pypi/warehouse/blob/80d136a075918b6b451b542f61b4936fa7e48b20/warehouse/organizations/models.py#L513-L516
This enum appears in the following database model:
https://github.com/pypi/warehouse/blob/80d136a075918b6b451b542f61b4936fa7e48b20/warehouse/organizations/models.py#L519-L537
This is linked in a secondary table to the Team model in https://github.com/pypi/warehouse/blob/80d136a075918b6b451b542f61b4936fa7e48b20/warehouse/organizations/models.py#L610-L612
Proposal
Given the release object in the JSON API, you can (probably) traverse the data model to get a list of maintainers with the following:
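from collections import defaultdict

# Map each username to the role names granted through any team.
maintainer_to_roles = defaultdict(list)
# Map each username to its user object; this also deduplicates users.
maintainers = {}
for tpr in release.project.team_project_roles:
    role_name = tpr.role_name
    for user in tpr.team.members:
        maintainers[user.username] = user
        maintainer_to_roles[user.username].append(role_name)
# Flatten into JSON-serializable records, one per user.
maintainers = [
    {
        "username": username,
        "name": user.name,
        "roles": sorted(maintainer_to_roles[username]),
    }
    for username, user in maintainers.items()
]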
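The traversal above cannot be run outside Warehouse, so here is a self-contained sketch of the same idea for experimentation. The dataclasses are hypothetical stand-ins that model only the attributes the traversal touches, and role names are plain strings rather than the enum linked above. It also collects roles in a set, so a user who receives the same role through two teams is listed once.

```python
from collections import defaultdict
from dataclasses import dataclass, field


# Hypothetical stand-ins for the Warehouse models; only the attributes
# used by the traversal are modelled here.
@dataclass
class User:
    username: str
    name: str


@dataclass
class Team:
    members: list


@dataclass
class TeamProjectRole:
    role_name: str
    team: Team


@dataclass
class Project:
    team_project_roles: list = field(default_factory=list)


def collect_maintainers(project):
    """Flatten team roles into one record per user, with all of their roles."""
    roles_by_username = defaultdict(set)
    users_by_username = {}
    for tpr in project.team_project_roles:
        for user in tpr.team.members:
            users_by_username[user.username] = user
            roles_by_username[user.username].add(tpr.role_name)
    return [
        {
            "username": username,
            "name": user.name,
            "roles": sorted(roles_by_username[username]),
        }
        for username, user in sorted(users_by_username.items())
    ]


alice = User("alice", "Alice Example")
bob = User("bob", "Bob Example")
project = Project(
    team_project_roles=[
        TeamProjectRole("Owner", Team([alice])),
        TeamProjectRole("Maintainer", Team([alice, bob])),
    ]
)
result = collect_maintainers(project)
print(result)
```

Sorting by username keeps the output stable between requests, which helps if the JSON response is cached or diffed.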
@cthoyt Thanks for the investigation! Would you consider turning this into a pull request? Our dev docs should help get you going. https://warehouse.pypa.io/
@miketheman yes, in fact I already have the code ready :) will post it later
@cthoyt Did you work any more on implementing this? I had a look today and maybe the code from the package html view could be used to load maintainers?
https://github.com/pypi/warehouse/blob/30846f6061d63b8c41a3275d1958efe3749524d8/warehouse/packaging/views.py#L97-L108
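As a rough illustration of that alternative, the sketch below derives maintainers from per-user roles on a project rather than from team membership. It assumes the HTML view works roughly this way; the linked code is not reproduced here, and the `User` and `Role` classes are hypothetical in-memory stand-ins, not the Warehouse models.

```python
from dataclasses import dataclass


# Hypothetical in-memory stand-ins; not the Warehouse models.
@dataclass(frozen=True)
class User:
    username: str
    name: str


@dataclass(frozen=True)
class Role:
    user: User
    project_name: str
    role_name: str


def maintainers_for(roles, project_name):
    """Return the distinct users holding any role on a project, by username."""
    by_username = {}
    for role in roles:
        if role.project_name == project_name:
            # Keep the first occurrence so each user appears once.
            by_username.setdefault(role.user.username, role.user)
    return [by_username[username] for username in sorted(by_username)]


carol = User("carol", "Carol Example")
dave = User("dave", "Dave Example")
roles = [
    Role(dave, "example-package", "Maintainer"),
    Role(carol, "example-package", "Owner"),
    Role(carol, "example-package", "Maintainer"),
    Role(dave, "other-package", "Owner"),
]
users = maintainers_for(roles, "example-package")
print([u.username for u in users])
```

In a database-backed version, the filter, de-duplication, and ordering would presumably be expressed as a single query instead of a Python loop.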