
Add AI check box to PR template

Open AdamRJensen opened this issue 4 weeks ago • 12 comments

During the bi-monthly core team meeting on 2025-12-10, the topic of pull requests using AI was discussed. It was decided to add a checkbox to the PR template that the contributor must check. The check box should note that the contributor takes responsibility for the contribution and that any AI-generated content has been thoroughly vetted.

AdamRJensen avatar Dec 10 '25 15:12 AdamRJensen

Maybe this should move to a discussion, but have the IP/copyright/license uncertainties around using AI for code been settled? Maybe they have, or everyone has just settled on it being too late, but does a contributor checking a box to say they take "responsibility" actually mean anything from a legal standpoint?

Fundamentally, using AI doesn't really change anything, but practically we know that the likelihood of using code from unknown sources goes way up.

williamhobbs avatar Dec 10 '25 16:12 williamhobbs

"Contributor takes responsibility for the contribution" has two meanings here:

  1. The copyright/licensing issue. AFAIK this remains to be settled. Realistically there is no way we will be able to keep pvlib's codebase free from AI output; in fact, it almost certainly already contains some. It's not clear what the right answer here is. But, it's also not what the checkbox is targeting.
  2. The code correctness issue. This is a lot clearer and what the checkbox is supposed to help with. Basically it means "If I (the contributor) used AI, I have checked its output myself and verified that the code is correct", i.e. that they have brought the code up to the same standards they would have if they wrote it the old-fashioned way. The point is to put in the contributor's brain that they are supposed to check the AI output and not just dump slop into a PR. Effective phrasing of the checkbox text TBD.

Some interesting discussion from other projects:

  • https://github.com/matthew-brett/sp-ai-post/blob/main/notes.md
  • https://groups.google.com/g/sympy/c/GTh0-aveLtk

kandersolar avatar Dec 10 '25 16:12 kandersolar

Makes sense. I support that approach. Thanks for clarifying.

williamhobbs avatar Dec 10 '25 16:12 williamhobbs

I don't really care if a PR's code was written by LLMs or not. I do care if it's a useful algorithm, good code, and code that fits within existing pvlib python paradigms.

I doubt that people that are vibe coding their way to getting a pvlib contribution on their CV/github profile are going to be dissuaded by a checkbox. Nor do I think such a checkbox ultimately shifts the responsibility of the quality of the contribution from the maintainers (broadly the community) to the contributor.

So my primary concern is that our review efforts are just as good if not stronger in this LLM era. I do not want to see a drain on reviewer resources due to more and more low value PRs. I have clicked unsubscribe on a handful of recent PRs and this is only going to get more common. So I think we need to change our policy of reviewing every PR* to declaring the right to dismiss PRs without review or consideration if they look like slop - LLM generated or not!

* My reading of pull request reviews is that we will review every PR, implicitly with a goal of getting it into shape for merging. At least this has been how I viewed PR reviews from 2014 until fairly recently.

wholmgren avatar Dec 10 '25 20:12 wholmgren

I don't really care if a PR's code was written by LLMs or not. I do care if it's a useful algorithm, good code, and code that fits within existing pvlib python paradigms.

Copyright issues aside, since we don't have a conclusion there, I agree.

I doubt that people that are vibe coding their way to getting a pvlib contribution on their CV/github profile are going to be dissuaded by a checkbox. Nor do I think such a checkbox ultimately shifts the responsibility of the quality of the contribution from the maintainers (broadly the community) to the contributor.

I don't think the idea is to dissuade anyone from using AI, but at least just try to encourage contributors who do use any AI to check what they are actually contributing for accuracy and that it fits within the scope and context of what pvlib needs. This has always been the expectation on contributors, whether they write their own code, get something from stackoverflow, or now use AI. However, there does seem to be a trend of low-value AI PRs that clearly haven't been vetted or checked by the contributor in any way, so I think the effort here is at least to emphasise to contributors, hopefully encourage them, to check for themselves what they are contributing.

I agree it might be optimistic to think that a checkbox would help, but... maybe it's worth a try as an easy first step that might help.

So my primary concern is that our review efforts are just as good if not stronger in this LLM era. I do not want to see a drain on reviewer resources due to more and more low value PRs. I have clicked unsubscribe on a handful of recent PRs and this is only going to get more common. So I think we need to change our policy of reviewing every PR* to declaring the right to dismiss PRs without review or consideration if they look like slop - LLM generated or not!

  • My reading of pull request reviews is that we will review every PR, implicitly with a goal of getting it into shape for merging. At least this has been how I viewed PR reviews from 2014 until fairly recently.

Since we're not automatically asserting (as far as I understand) a link between AI-PRs and low quality, I think this point leans more in the direction of a separate discussion that took place where we did conclude that we need to do something about low-value PRs that may not originate from or be motivated by interests that are aligned with the pvlib mission.

The good first issue tag was removed, and I think everyone was open to devising a strategy to dissuade one-time low-value PR contributions in future if they persist, e.g. requiring an intro from users elsewhere before a PR (I think the other repo doing something like this that was used as an example might have been numpy?), and being open to closing PRs without review (as you suggested) and directing the contributor to new text on the docs page explaining what's needed before a PR.

@pvlib/pvlib-maintainer perhaps we need to publish a meeting notes summary somewhere?

RDaxini avatar Dec 10 '25 21:12 RDaxini

During the bi-monthly core team meeting on 2025-12-10

Am I still on the mailing list?

adriesse avatar Dec 10 '25 21:12 adriesse

Before seeing/replying to Will's comment, I had a change on my fork ready for a PR. Are we still going ahead with this or not? I am still in favour. I think we can always reverse it if it turns out to be ineffective.

On that note, the following is what I had for text:

- [ ] I acknowledge that I am responsible for the content of my contribution and any AI-generated material has been vetted for accuracy and compatibility with the pvlib contributing guidelines

Or, shorter:

- [ ] Any AI-generated material has been vetted for accuracy and compatibility with the pvlib contributing guidelines
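For illustration only, here is a minimal sketch of how a CI step could tell whether a box like this has been ticked in a PR description. Nothing like this has been proposed in the thread; the function name `ai_checkbox_ticked` and the constant `AI_CHECKBOX_TEXT` are hypothetical, and the wording it searches for is just the shorter suggestion above.

```python
import re

# Hypothetical wording to look for, taken from the shorter suggestion above.
AI_CHECKBOX_TEXT = "Any AI-generated material has been vetted"

# Matches a Markdown task-list item such as "- [x] text" or "* [ ] text".
TASK_ITEM = re.compile(r"^\s*[-*]\s*\[(?P<mark>[ xX])\]\s*(?P<text>.*)$")


def ai_checkbox_ticked(pr_body: str, needle: str = AI_CHECKBOX_TEXT) -> bool:
    """Return True if the task-list item containing `needle` is ticked."""
    for line in pr_body.splitlines():
        match = TASK_ITEM.match(line)
        if match and needle.lower() in match.group("text").lower():
            return match.group("mark").lower() == "x"
    # The item is missing altogether (e.g. the template text was deleted).
    return False


if __name__ == "__main__":
    body = "- [x] Any AI-generated material has been vetted for accuracy"
    print(ai_checkbox_ticked(body))
```

A check like this only confirms that the box was ticked; it says nothing about whether the contributor actually vetted anything.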

Just first thoughts. Does anyone else have any suggestions?

RDaxini avatar Dec 10 '25 22:12 RDaxini

I want to highlight the following insights:

@wholmgren

So I think we need to change our policy of reviewing every PR* to declaring the right to dismiss PRs without review or consideration if they look like slop - LLM generated or not!

@RDaxini

we did conclude that we need to do something about low-value PRs that may not originate from or be motivated by interests that are aligned with the pvlib mission

Honestly, I think we are not doing most of those look-easy-tasty issues cause either we are freaking busy with our lives -blindly guessing some burnout levels too- or cause we are leaving those issues for newcomers when it comes to welcoming new potential mid- or long-term contributors.

And for @wholmgren

I doubt that people that are vibe coding their way to getting a pvlib contribution on their CV/github profile are going to be dissuaded by a checkbox.

^ this. My suggestion is to stop being welcoming.

a. No good first issues publicly available (glad to see you took that decision in the meeting).
b. Admonition in the PR template: "scientific" contribs always welcome; "technical" ones for people that want to get involved a bit further (i.e., become a maintainer, even if it means for a short period of time. I can't help but think of my bachelor's dissertation case - I wouldn't expect them to stick around for too long); and "minor" or "good first issue" for people that want to get on board to later contribute any of the two former categories.
c. You've opened a PR suspected of vibe coding? Close automatically. Assign a label and configure a bot to automatically explain or link to the rationale in a message (as stated in this SymPy thread comment), that can further help the contributor - set expectations and goals.
d. I'm also interested in, and seems to be a nice suggestion: @RDaxini

requiring an intro from users elsewhere before a PR (I think the other repo doing something like this that was used an example might have been numpy?)

e. Maybe removing us from GSoC lists - or putting up a big warning about this - would help. It would help even more if GSoC organisers were concerned about that. However, it's difficult to think that a company that benefits from stupid vibe coders would take this kind of action.
f. The AI check box being discussed right now could be an automated message triggered on PRs for users that have never contributed to the repo before. To allow for a bigger rationale without bloating honest and internal work.

The main drawback that I see is whether by being too harsh (unwelcoming), we stop the generational changeover. Luckily, I think pvlib and its maintainers' networks hold a good position regarding this.

My feeling from reading the previous conversation (and, to the relevant extent, @kandersolar's links) is that the copyright argument is a monkey patch over the ~upcoming~ current crisis of the unknowledgeable "educated". My argument against LLMs (even though I'm a user, more for aux tasks than for real code writing in OSS) is a bit different and out of scope for this conversation. Enough of a rant for now.

Specific suggestions for the checkbox:

- [ ] I acknowledge that I'm responsible for the quality of this pull request and I'm not using LLMs (Copilot, GPTs, etc.) or other tools without *fully understanding* their output.
- [ ] I acknowledge that using LLMs (Copilot, GPTs, etc.) is a violation of the code of conduct as long as their output is not deeply understood and reviewed by me, the contributor.
  Wouldn't it be great to update the code of conduct? #2619 Section "Encouraged Behaviors", points 2. and 4., CCoC v3.0.
- [ ] I acknowledge that I'm responsible for easing the review process, and for that purpose I assume the task to understand and integrate the changes into the codebase. Reviewers are given the authority to close this pull request if there are signs of dishonest work. Suggesting LLMs or other AI-generated content without deeply reviewing it is completely forbidden.

echedey-ls avatar Dec 11 '25 00:12 echedey-ls

f. The AI check box being discussed right now could be an automated message triggered on PRs for users that have never contributed to the repo before. To allow for a bigger rationale without bloating honest and internal work.

Is this https://github.com/marketplace/actions/first-contribution the mechanism by which an automatic message would be generated, or did you have something else in mind?

RDaxini avatar Dec 11 '25 17:12 RDaxini

f. The AI check box being discussed right now could be an automated message triggered on PRs for users that have never contributed to the repo before. To allow for a bigger rationale without bloating honest and internal work.

Is this github.com/marketplace/actions/first-contribution the mechanism by which an automatic message would be generated, or did you have something else in mind?

Seems to do what I had in my mind, yes. Thx for taking the time to do that research ❣️
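To make the idea concrete, a rough sketch of the decision such an automation would make, not of the linked action itself. It assumes the `author_association` field that GitHub includes in pull request payloads; treating `"NONE"` as "new to the repo" is an assumption of this sketch, and `WELCOME_MESSAGE` and `comment_for_new_contributor` are hypothetical names with placeholder wording.

```python
from typing import Optional

# Values of GitHub's `author_association` field treated here as "new to the repo".
# Including "NONE" is an assumption of this sketch, not something decided in the thread.
NEW_CONTRIBUTOR_ASSOCIATIONS = {"FIRST_TIMER", "FIRST_TIME_CONTRIBUTOR", "NONE"}

# Hypothetical placeholder text; the real wording is still to be decided.
WELCOME_MESSAGE = (
    "Thanks for your first pull request to pvlib! Please confirm that you have "
    "read the contributing guidelines and that any AI-generated material has "
    "been vetted for accuracy."
)


def comment_for_new_contributor(author_association: str) -> Optional[str]:
    """Return the message to post on a PR, or None for returning contributors."""
    if author_association.upper() in NEW_CONTRIBUTOR_ASSOCIATIONS:
        return WELCOME_MESSAGE
    return None


if __name__ == "__main__":
    for association in ("FIRST_TIME_CONTRIBUTOR", "MEMBER"):
        print(association, "->", comment_for_new_contributor(association))
```

Keeping the longer rationale in an automated comment rather than in the template means regular contributors never see it.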

echedey-ls avatar Dec 11 '25 19:12 echedey-ls

So... now I am unsure how to proceed.

  1. Original plan: add checkbox to PR template, decide the wording in the issue. Seems like we don't have consensus on including a checkbox now (https://github.com/pvlib/pvlib-python/issues/2617#issuecomment-3638845387).
  2. Alternative is an auto-generated comment for new users (https://github.com/pvlib/pvlib-python/issues/2617#issuecomment-3643003140).
  3. Third, final alternative, is not to do anything.

So how should we proceed? Do we need to vote on these two (three?) options?

RDaxini avatar Dec 12 '25 19:12 RDaxini

I don't object to a checkbox. I don't think it's going to accomplish much but I don't think it hurts. I am also fine with an auto generated comment for new contributors.

wholmgren avatar Dec 12 '25 21:12 wholmgren

Thanks for clarifying @wholmgren. I didn't mean to mischaracterize your earlier comment.

I've opened #2624 with the checkbox update. I like the idea of an autogenerated comment for new contributors too. I'd like to add that into the PR as well if no-one objects.

RDaxini avatar Dec 16 '25 04:12 RDaxini

It occurs to me that this is a fast-moving target, so we'll probably have to modify the approach in the near future regardless.

adriesse avatar Dec 16 '25 08:12 adriesse