Add a linter check for an email address in the "maintainer" field.
It would be nice to add a linter check for an email address in the "maintainer" field.
See https://github.com/ocaml/opam-repository/pull/26297#discussion_r1694016496
We have implemented and deployed this check, but, as per https://github.com/ocaml/opam-repository/pull/26581#issuecomment-2360205221, Marcello has some alternate concerns/requests about the linting logic. They propose to
relax the linter to avoid failing if at least one email is present (or even if a public repository is listed imo)
I don't think taking into account the visibility of git forges is viable. The repo must not only be public, but also have issues (or discussions, or some other feature?) enabled, and IMO this kind of analysis of contributors' development processes should be out of scope for a linting check. WDYT?
Regarding the suggestion that we "avoid failing if at least one email is present", this is an easy change to make, but I am not sure why it is worth relaxing the requirement here or introducing/allowing such variation in the opam data. The key question to my mind is:
What does it mean to be a maintainer of a package, if it does not mean committing to being contactable and responsive to users of the package? If we allow names in that list without any way of contacting them, what is the utility from a user's (or an opam-repo maintainer's) perspective?
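For concreteness, the two variants under discussion differ roughly as follows. This is only a minimal sketch, not the code deployed in opam-repo-ci: `has_email`, `strict_ok`, and `relaxed_ok` are hypothetical names, the email detection is deliberately crude, and reading the current check as "every entry must carry an email" is an assumption based on the relaxation being requested.

```ocaml
(* Hypothetical helper: does a maintainer entry such as
   "First Last <first.last@example.org>" (or a bare address) contain
   something that looks like an email? Crude on purpose: this is not
   a full address parser. *)
let has_email (entry : string) : bool =
  let candidate =
    match String.index_opt entry '<', String.rindex_opt entry '>' with
    | Some i, Some j when i < j -> String.sub entry (i + 1) (j - i - 1)
    | _ -> String.trim entry
  in
  match String.index_opt candidate '@' with
  | Some at ->
    at > 0
    && at < String.length candidate - 1
    && not (String.contains candidate ' ')
  | None -> false

(* Assumed reading of the current check: every listed maintainer is
   contactable by email, and the list is not empty. *)
let strict_ok (maintainers : string list) : bool =
  maintainers <> [] && List.for_all has_email maintainers

(* The requested relaxation: at least one entry carries an email. *)
let relaxed_ok (maintainers : string list) : bool =
  List.exists has_email maintainers
```

With `["First Last <fl@example.org>"; "Other Person"]`, `relaxed_ok` accepts the list while `strict_ok` rejects it, which is exactly the variation in the data that the question above is about.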
I am discovering this new check in a submission this morning, and just wanted to voice some general push-back against such a check. If the project is on GitHub and has an issue tracker, for some projects I think it is reasonable for that to be the default way of interacting with the project maintainer. Don't you agree? I am not sure of the meaning of providing an email address for maintenance. As an individual, I surely wouldn't want to hint that I will read and/or respond to emails sent directly to me (rather than normal interaction via GitHub). Would you be OK to relax it somewhat?
I don't have very strong feelings on this, but I can see the logic of requiring a uniform, distributed, time-tested, and platform agnostic way of contacting maintainers. As such, I can offer some push back to your push back :smile:
If the project is on GitHub and has an issue tracker, for some projects I think it is reasonable for that to be the default way of interacting with the project maintainer. Don't you agree?
This is a reasonable way of submitting issues, but it's not necessarily the most reasonable way of contacting maintainers. Consider https://github.com/ocaml/opam-repository/issues/23789. The plans developed there currently entail setting up a process for archiving defunct and unmaintained packages. If we require that all maintained projects have an associated email address, then it is easy for us to automate (or semi-automate) outreach to maintainers when a package breaks. If, instead, we need to account for every possible bug-tracking system in use, we will have a maintenance nightmare and a technical quagmire.
I am not sure of the meaning of providing an email address for maintenance.
To defer to Wikipedia, "Email is a ubiquitous and very widely used communication medium; in current use, an email address is often treated as a basic and necessary part of many processes in business, commerce, government, education, entertainment, and other spheres of daily life in most countries." :wink: The meaning of providing an email address for maintenance is that the address may be sent messages relating to its maintenance, and it is expected that active maintainers would read and respond to these.
IMO, there are quite a few good reasons that we should want to require maintainers to register a way of contacting them that is based on a uniform, time-tested, distributed, open standard. And I've only hinted at some of them above. Email seems like the best available alternative with these qualities.
All this said, I'm happy to work with all interested parties on figuring out a solution here that will satisfy all our needs and support a healthy, well maintained software ecosystem, without imposing unnecessary requirements.
(IMO, it is too bad that github does not appear to support creating issues based on messages sent to an email address (https://webapps.stackexchange.com/questions/76055/can-i-create-an-issue-in-a-github-repository-by-sending-an-email).)
As such, I can offer some push back to your push back
And it is very welcome, thank you!
I can't believe you linked me to the Email_address wiki page, you villain 😄 ! I'll try making it up to you in one of our next convos - of which I hope there'll be many in the future!
the address may be sent messages relating to its maintenance, and it is expected that active maintainers would read and respond to these.
That's what I feared, I guess. It sounds like accepting a service level, without a specific service level agreement. From experience, at least on GitHub, it feels to me like there's some kind of social expectation that people get to things when they get to things. Life happens, it's all in the open, sometimes other folks involved with projects jump in, etc. Some issues can stay unattended for weeks or more, and it can be OK. With private emails, there's a lot of this that you lose. And a chunk of context you won't necessarily have if you join an existing project as a maintainer.
Another concern I have is spam settings. I wouldn't be surprised if, with the current settings I have, there was a good chance for such maintenance emails coming from unknown senders to never make it to me.
That being said, thank you for sharing the perspective of the opam-repository maintainers, which I simply did not consider. Outreach to maintainers at scale doesn't sound like an easy problem. Thank you for your hard work!
Thinking a bit more about this, and in light of your use case, I think my reluctance may be due to the fact that I've been kind of procrastinating about setting up an email dedicated to my open source activity, and was kind of OK with the status quo that I had been able to get away so far without resolving this question. This new lint put this subject at the top of my stack. I think it's reasonable for me to solve this anyways. I'll think about it and let you guys work in peace. Perhaps using the plus addressing feature of a gmail free plan is a simple way to get started, as in "First Last <[email protected]>". I'll think about it. Thanks a lot, and good luck with this infra project!
Creating a better way to contact maintainers is an entirely laudable goal, but pushing a check into opam-repository without some design consultation seems a poor way to achieve this goal.
- how will the check be backported to existing packages?
- how do other package managers capture this metadata? is it always email addresses these days?
- how will the metadata be maintained? do we ping the email addresses every so often?
- we could consider other mechanisms to alert maintainers, such as using the discuss.ocaml.org ids to send messages to maintainers specifically about their opam packages. This works with groups and so on...
Overall though, I'd like us to not just push checks into the live opam-repo-ci without at least some consultation about what the checks are first...
Thank you for the considerations and directions for improvement, @avsm. I'm sorry for the delay in replying, but I was out sick last week.
Regarding opam repo policy and CI change management
I'd like to start by addressing the point you raise around process:
Overall though, I'd like us to not just push checks into the live opam-repo-ci without at least some consultation about what the checks are first...
It looks like we may need to formalize an explicit process around developing the opam CI and packaging policies. We clearly moved forward without having achieved enough consensus or understanding from various stakeholders in this case, and having a clear, documented process should help us reduce the chance of that recurring. In this case, it may be a growing pain from my recent addition to the maintenance team?
That said, reviewing the (implicit, informal) process, there was certainly some consultation and there were some opportunities for interested parties to weigh in. I think we can improve this a lot tho. I have started some notes and it is on our agenda for the next opam repo maintainer meeting. Thank you for noting the problem!
Regarding the concrete design considerations
how will the check be backported to existing packages?
AFAIK, we already have a precedent for lint checks that are added without retroactively backporting to existing packages (e.g., to exclude pin-depends). It is also common practice, when introducing linting checks, to incrementally improve the quality of the code base using lints without overhauling all existing code into conformance. To my mind, this can be approached as an incremental improvement on new or updated packages, with existing packages grandfathered in.
This is not to say that it would not be worthwhile to consider backports. I just don't see why that would be a prerequisite here.
how do other package managers capture this metadata? is it always email addresses these days?
- npm requires registration that calls for an email for package authors. "package owners" (maintainers) must have an npm account.
- crates.io requires registration that requires a github account. "crate owners" (maintainers) must have a crates.io account.
- pypi requires registration that calls for an email. Their policy for contacting maintainers is via email.
- nuget requires registration that requires a Microsoft Account
- nix has a more fluid notion of package maintainers. It allows email, matrix ID, or github contact, but one of the three must be provided.
how will the metadata be maintained? do we ping the email addresses every so often?
This is a good question, and would apply to any form of contact I think. It can probably be bucketed along with a more general question of how we ensure home pages, issue trackers, or source archive URLs stay live? Adding something like this into the health check would be great.
we could consider other mechanisms to alert maintainers, such as using the discuss.ocaml.org ids to send messages to maintainers specifically about their opam packages. This works with groups and so on...
Indeed, we could consider any number of mechanisms. IMO, the main benefits of email are:
- It is a venerable, virtually universally used, open, free, distributed protocol.
- Anyone using the internet has at least one email address, and accounts are easy to set up if additional emails are needed, with no vendor lock-in or walled gardens in the mix.
- It is straightforward to automate and semi-automate sending important information to email addresses.
- We don't need to ask contributors to register new accounts on any services they don't already use.
I'm not opposed to using discuss (or any other platform), but I cannot think of a more accessible or lightweight option than letting people use an email address. I can imagine people or organizations being willing to maintain packages but not interested in registering for our forum or whatever. We could also support any number of different contact methods, but that seems like needless complexity and overhead for maintainers.
Looking forward to hearing any further thoughts on this!
We plan to discuss this at the next opam repo maintenance meeting. I think one key topic to decide there is whether we want to remove this check for now.
Email addresses are generally already available in the commit data of PRs into the repo, so at least for all cases where the maintainers are involved in package publication, we are only asking for the inclusion of data that is already being shared in the PR.
In the last opam maintainer meeting we arrived at this proposed change to the current lint:
The package must either list an issue tracker with a reachable URL, or at least 1 email must be included in the maintainer field.
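As a rough sketch of what that rule could look like (assuming the issue tracker is taken from the `bug-reports` field, and not reflecting the actual opam-repo-ci code): the `pkg` record, `has_email`, and `lint` are hypothetical names, and the reachability test is passed in as a function so the example stays free of network code.

```ocaml
(* Hypothetical, simplified view of the two opam fields involved. *)
type pkg = {
  maintainer : string list;   (* entries of the "maintainer" field *)
  bug_reports : string list;  (* entries of the "bug-reports" field *)
}

(* Crude email detection: an '@' with at least one character on each
   side. A real check would likely be stricter. *)
let has_email (entry : string) : bool =
  match String.index_opt entry '@' with
  | Some at -> at > 0 && at < String.length entry - 1
  | None -> false

(* [reachable] stands in for whatever HTTP probe the CI would use;
   it is an assumption of this sketch, not an existing API. *)
let lint ~(reachable : string -> bool) (p : pkg) : (unit, string) result =
  if List.exists reachable p.bug_reports
     || List.exists has_email p.maintainer
  then Ok ()
  else
    Error
      "package needs a reachable bug-reports URL or at least one \
       maintainer email"
```

Note that making the lint depend on URL reachability means the same package can pass or fail depending on network conditions at check time, which may be worth keeping in mind when weighing this option.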
Thoughts?