Add package uploader/maintainers to the Package metadata API

Open Duppils opened this issue 4 years ago • 4 comments

What's the problem this feature will solve?

Help identify trustworthy package uploaders. Currently, the package metadata API https://pypi.org/pypi/{package_name}/json returns the repository maintainers, but not the package maintainers. Access to the package uploader/maintainer can help build credibility for the package or expose risks.
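To make the gap concrete, here is a minimal sketch of what can be read from the JSON document today. It assumes the endpoint is served at `https://pypi.org/pypi/{name}/json` and that the `info` object carries the `author`, `author_email`, `maintainer`, and `maintainer_email` fields declared in the package metadata. The helper names `declared_people` and `fetch_declared_people` are made up for illustration. Note that these values are whatever the uploader wrote into the metadata, not the PyPI accounts that hold roles on the project.

```python
import json
import urllib.request

# Assumed URL pattern for the project-level JSON document.
JSON_API = "https://pypi.org/pypi/{name}/json"


def declared_people(payload: dict) -> dict:
    """Return the author/maintainer fields declared in the package metadata."""
    info = payload.get("info", {})
    keys = ("author", "author_email", "maintainer", "maintainer_email")
    # Missing or empty fields are normalised to None.
    return {key: info.get(key) or None for key in keys}


def fetch_declared_people(name: str) -> dict:
    """Fetch the JSON document for a project (requires network access)."""
    with urllib.request.urlopen(JSON_API.format(name=name)) as response:
        return declared_people(json.load(response))
```

Calling `fetch_declared_people("flask")` would return only self-declared strings, so nothing in the result says which PyPI user actually uploaded a release.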

Describe the solution you'd like

The package maintainers are added to the API. If the package maintainers' historic contributions could be added to this or a separate API, that would help identify trustworthy packages.

Additional context

Consider home-brewed or forked packages, which should not inherit credibility, such as https://pypi.org/project/f-ask/. At a glance, this package (incorrectly) looks to be owned by the Pallets team, which has a different level of trust associated with it. This is just an example; please do not penalize whoever uploaded it. I do not wish to check whether it is a malicious typo-squat, as that is irrelevant to the problem to fix.

Duppils, Aug 31 '21 12:08

I second @Duppils, though my motivation for wanting this information is to reconcile PyPI users with GitHub user accounts, Wikidata entries, and ORCID identifiers, so that we in the computational life/natural sciences (and others) can better report on bibliometrics of software.

What code needs to be changed

The following code is responsible for what gets put on the metadata API (https://pypi.org/pypi/{package_name}/json). The trick is just to connect the Project model in the database to the associated information, then do some SQLAlchemy magic (i.e., joining + filtering) to get it out.

https://github.com/pypi/warehouse/blob/186180c600da5a99a7bdc0da548f852351d91e84/warehouse/legacy/api/json.py#L63-L186

Investigation

This appears to be the enum responsible for people's roles in a package:

https://github.com/pypi/warehouse/blob/80d136a075918b6b451b542f61b4936fa7e48b20/warehouse/organizations/models.py#L513-L516

This enum appears in the following database model:

https://github.com/pypi/warehouse/blob/80d136a075918b6b451b542f61b4936fa7e48b20/warehouse/organizations/models.py#L519-L537

This is linked in a secondary table to the Team model in https://github.com/pypi/warehouse/blob/80d136a075918b6b451b542f61b4936fa7e48b20/warehouse/organizations/models.py#L610-L612

Proposal

Given the release object in the JSON API, you can (probably) traverse the data model to get a list of maintainers with the following:

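from collections import defaultdict

# Collect each team member once, together with every role their teams hold.
# A set avoids duplicate role names when a user is in several teams.
maintainer_to_roles = defaultdict(set)
users = {}
for tpr in release.project.team_project_roles:
    role_name = tpr.role_name
    for user in tpr.team.members:
        users[user.username] = user
        maintainer_to_roles[user.username].add(role_name)

# Serialize to a JSON-friendly list of records.
maintainers = [
    {
        "username": username,
        "name": user.name,
        "roles": sorted(maintainer_to_roles[username]),
    }
    for username, user in users.items()
]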
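For anyone who wants to try the traversal without a Warehouse checkout, here is a self-contained sketch of the same aggregation. The `User`, `Team`, and `TeamProjectRole` dataclasses are hypothetical stand-ins that model only the attributes the loop touches (they are not the real SQLAlchemy models), role names are assumed to be plain strings, and `summarize_maintainers` is a name invented for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical stand-ins for the Warehouse models; only the attributes
# used by the traversal are modelled here.
@dataclass
class User:
    username: str
    name: str


@dataclass
class Team:
    members: list


@dataclass
class TeamProjectRole:
    role_name: str  # assumed to be a plain string in this sketch
    team: Team


def summarize_maintainers(team_project_roles):
    """Group team members by username and list each user's distinct roles."""
    users = {}
    roles = defaultdict(set)
    for tpr in team_project_roles:
        for user in tpr.team.members:
            users[user.username] = user
            roles[user.username].add(tpr.role_name)
    # Sort by username so the output is stable between calls.
    return [
        {"username": u, "name": users[u].name, "roles": sorted(roles[u])}
        for u in sorted(users)
    ]


alice = User("alice", "Alice A.")
bob = User("bob", "Bob B.")
tprs = [
    TeamProjectRole("Maintainer", Team([alice, bob])),
    TeamProjectRole("Owner", Team([alice])),
]
print(summarize_maintainers(tprs))
```

Here `alice` ends up with both roles and `bob` with one. Note that this only covers roles granted through teams; roles assigned directly to individual users would need a separate traversal.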

cthoyt, Feb 26 '23 13:02

@cthoyt Thanks for the investigation! Would you consider turning this into a pull request? Our dev docs should help get you going. https://warehouse.pypa.io/

miketheman, May 23 '23 09:05

@miketheman yes, in fact I already have the code ready :) will post it later

cthoyt, May 23 '23 09:05

@cthoyt Did you work any more on implementing this? I had a look today, and maybe the code from the package HTML view could be used to load maintainers?

https://github.com/pypi/warehouse/blob/30846f6061d63b8c41a3275d1958efe3749524d8/warehouse/packaging/views.py#L97-L108

peterk, Sep 22 '24 19:09