Should we have a policy on AI-submitted pull requests?
What's the problem this feature will solve?
Prompted by the discussion in https://github.com/pypa/pip/issues/13393, I'd like to suggest that pip adds an official policy on AI-generated pull requests. Note that I'm posting this in the tracker for transparency, but comments from anyone other than maintainers aren't going to be particularly helpful - if this issue degenerates into debates between non-maintainers over the merits (or otherwise) of AI for code generation, we're going to get nowhere and we'll probably have to take any discussions offline and make decisions in private. So I'd request that as far as possible, this discussion is limited to maintainers.
Describe the solution you'd like
Personally, I would rather we prohibit AI-generated code in PRs. Obviously, it won't always be possible to detect AI-generated code, so we'll be relying on contributors acting in good faith and respecting our policies, but I think that's a reasonable compromise.
My reasons for not wanting AI-generated code are:
- Copyright. I'm concerned that contributors submitting AI-generated code can't reasonably claim ownership of the code they are submitting (there have been documented cases of AI producing direct copies of code owned by other developers without permission), and therefore they can't give permission for pip to use that code under pip's license. This is something that could lie hidden in our codebase for some time before being challenged, and fixing the issue at that point could be problematic. Of course, copyright litigation over open source code is extremely rare in practice, so this is largely a theoretical risk, but I think we owe it to our users and redistributors to take whatever steps we can to eliminate it.
- Correctness. When reviewing code, we assume at a minimum that the contributor has taken care to write code that addresses the subject of the PR, doesn't adversely affect other parts of the codebase, and in general, is correct. I don't personally feel the need to carefully check every line of code - while contributors can (and do!) make mistakes, we can assume they know what they were trying to do and made reasonable efforts to achieve that. With AI-generated code, those assumptions go out of the window. An AI can hallucinate APIs that don't exist, introduce security or correctness bugs by failing to respect the contract of the APIs it calls, etc. The problem here is not so much that incorrect code can be submitted (as I say, human contributors - and maintainers! - can do that too), but rather that there's no-one to discuss the code with. Unless the (human) contributor has reviewed and understood the code to the point where they might as well have written it themselves without AI assistance, they aren't able to address basic questions like "what does this section of code mean?" As a reviewer, I don't want to be forced into the role of checking AI-generated code - that's not how I want to spend my time on this project.
- Maintainer effort. This follows on from the previous point, but if the contributor is using AI to generate their PR, the process of giving feedback and getting that feedback incorporated into the PR becomes more difficult. Crafting prompts for LLMs is a skill, and review feedback is generally not intended as a prompt in that sense. So maintainers will need to do more work to ensure that their feedback is in a form that will be interpreted correctly, or they will have to go through multiple review cycles to get the changes they want. Overall, this will increase the maintainer effort required to review PRs, and maintainer resources are already in critically short supply.
Please note that I'm not suggesting we prohibit contributors from using AI as part of their process for developing a PR for pip. We can't (and shouldn't) try to dictate how people choose to develop code. What's important, though, is that the code actually submitted is all the contributor's own work, not copied directly from an AI tool's output.
My specific suggestion for wording would be:
Pip does not accept AI-generated content for pull requests[^1]. Contributors are expected to submit only code which they have written themselves (and hence can assert ownership for purposes of copyright assignment). While contributors may use whatever tools they like, including AI assistants, when developing a pull request, it is the contributor's responsibility to ensure that submitted code meets the project requirements, and that they understand the submitted code well enough to respond to review comments.
(I'm tempted to say that we should explicitly note the risk that AI generated code may have been copied from non-open sources without permission, but I don't want our policies to get too "political", and this might be a step too far).
[^1]: I'm inclined to add "or issues" here, but my feelings on AI-generated issues are less strong.
Alternative Solutions
The obvious alternative is to simply not have a policy, and let maintainers decide for themselves. If that were the decision, I'd personally just not get involved with AI-generated content. But that approach wouldn't protect us from code with unclear ownership getting added to the codebase, nor would it give any clear guidance for contributors wanting to know whether AI-generated code is acceptable.
Additional context
A number of projects have already spoken out against AI-generated PRs (and issues). The most high profile one I'm aware of is curl (see https://www.linkedin.com/posts/danielstenberg_hackerone-curl-activity-7324820893862363136-glb1/ for the details).
Code of Conduct
- [x] I agree to follow the PSF Code of Conduct.
I would be happy to take some of your specific suggestion without the mention of AI, e.g.
> While contributors may use whatever tools they like, when developing a pull request, it is the contributor's responsibility to ensure that submitted code meets the project requirements, and that they understand the submitted code well enough to respond to review comments.
But even without AI I find this line problematic:
> Contributors are expected to submit only code which they have written themselves
Because we accept commits from bots (e.g. the pre-commit bot and dependabot) and expect contributors to run linting, both of which can modify code in a way the contributor hasn't "written themselves".
On the topic of AI generated code, I find banning it problematic for two main reasons:
First, copyright, correctness, and maintainer effort are not unique to AI: nothing stops a contributor from submitting poorly thought out code, copied and pasted from multiple locations, that they don't really understand.
Second, as a contributor, the line between what is AI-generated and what is AI-assisted will be unclear, and as a maintainer I have no way of distinguishing between the two other than assessing the quality of the code and the trustworthiness of the contributor.
I would rather have a policy that focuses on the problems of low-quality submissions: contributors who submit poor code, respond poorly to maintainer feedback, or write spammy issues may be blocked from submitting further contributions.
I am willing to change my mind on this, though, if we start getting the kind of issues the curl project has received. But curl participates in a program that pays out money if security issues are accepted, so the incentive structure for submitting issues and contributions is quite different compared to pip.
As a remark, the uv project lead implied in a tweet today, June 9th 2025, that he is using Claude.
> there have been documented cases of AI producing direct copies of code owned by other developers without permission
Can you link to some examples? Last time I checked, these alleged cases were either specifically designed and prompted to copy code, or involved pseudo-LLM models constructed specifically to copy text and pass it off as LLM-generated in order to obfuscate copyright violations.
(I've hidden an off-topic comment about Charlie's tweets)
> Copyright. I'm concerned that contributors submitting AI-generated code can't reasonably claim ownership of the code they are submitting (there have been documented cases of AI producing direct copies of code owned by other developers without permission), and therefore they can't give permission for pip to use that code under pip's license.
This is something we can ask PSF Legal to give us specific guidance on. There is a version of this where they come back asking us to require a CLA or DCO to deal with the risk of such liabilities.
FWIW, one of the lightning talks at the language summit at this year's PyCon US was a core developer encouraging other core developers to use these LLM-assistance tools more proactively.
> I'm inclined to add "or issues" here, but my feelings on AI-generated issues are less strong.
I feel like they're both equally bad in terms of what we'd need to do -- it'd be OK for us to just mark them as "spam" and move on. In either form, this is effectively AI slop and we should treat it as such.
> I would be happy to take some of your specific suggestion without the mention of AI, e.g.
>
> While contributors may use whatever tools they like, when developing a pull request, it is the contributor's responsibility to ensure that submitted code meets the project requirements, and that they understand the submitted code well enough to respond to review comments.
I prefer this language (for reasons already outlined excellently). Some part of me wants to add something to the effect of "We will mark LLM-generated slop as spam without additional discussion." at the end there. In general, not spending maintainer energy on low-effort / low-quality contributions by saying "this is not welcome here" is sufficient IMO.
I'd just leave a 👍, but to be explicit: I agree with all points from @notatallshaw and @pradyunsg (and I would be totally fine with adding a "We will mark LLM-generated slop as spam without additional discussion" note).