Infrastructure/DevOps needs attention
Updated diagram:

Spread too thin. Single points of failure. Not prioritized enough in paid maintenance work. Need to be recognized properly on astropy.org/team page. This goes beyond the termed SOSS hire.

P.S. Diagram made with https://app.diagrams.net/. I have the .drawio file and am willing to provide it if you have a place to version control such things.
It used to be much simpler. It's not just the people who develop the stuff, it's also complicated for contributors. I know it's a piece of cake for people like @pllim, but I have not even managed to set up all the tools to run tox in my environment yet (after years of advertising conda, I'm now using conda and it does not play really nice with tox). Instead, I rely on the CI tests again when I do an occasional code contribution, or just decide not to contribute code to astropy (i.e. I write a local script for my problem, but don't turn it into a PR).
The complexity for developers is not something that we can document-away: It's changing frequently and developers have to know something about what's going on behind the scenes because things do fail (e.g. a test does not pass, CI acts up etc.). That makes the contributor-pipeline dry up.
The people this setup is made for are pure users. We develop a stable, highly tested package. They install and it all works.
In the past, the answer to every problem has been to layer on more technology (e.g. add a pytest module, add a bot, add more CI). I wonder if we need to dial back some of that for simplicity. At least astropy-helpers is going away...
It used to be much simpler.
I am not sure how true this is. It's definitely different now, but I remember my first few astropy contributions and the tooling has never been what I would consider "simple". (setup.py test was something I needed to learn as I had never seen it anywhere else etc.)
[conda] does not play really nice with tox
I would be very interested in the specifics of this, although here isn't the right place. tox shouldn't care how you installed python. I have used it happily with multiple system Pythons, venvs and conda (a little).
I look at this diagram and I can't see anything which is easy to remove, all the things here are doing something, and from my experience with APE 17, changing any of those things is likely to be hard and someone is likely relying on some part of it you didn't really think about. (Or Erik is refusing to use venvs.)
The main goal of APE 17 for me was to make the components simpler, and to increase our use of tools we aren't maintaining and are more common around the wider ecosystem.
@hamogu - We have a complex library ecosystem and users expect a stable package. This complexity is not unusual in the scientific Python ecosystem.
And the contributor pipeline is not drying up, there are 20-30 new people in each release. It's not a requirement to understand each piece listed here to make contributions, but accept that there is a price to pay for stability, and hacking things together either for features or infra means paying the technical debt later. But it's time for that debt to stop being passed on to the people who keep this and the downstream integration afloat, and to start valuing them and their contributions and expertise.
@pllim - the wheels and conda repos have been left out, at least the daily wheel branch should be part of this picture
@bsipocz , I intentionally did not include the release stuff (they are all implied under "release extravaganza") because the release stuff deserves a chart on its own, but I have never done an astropy release before, so if I make that chart, I won't do it justice.
@hamogu, I agree with others that removing the pieces will be non-trivial, if not impossible. This chart is meant to increase their visibility in case we want to pay more people for maintenance, but not to say that we should get rid of them. For instance, even moving pytest-doctestplus to pytest proper would need a dedicated person (if not more) to make it happen, see astropy/pytest-doctestplus#100.
Need to be recognized properly on astropy.org/team page
I'm certainly open to this, but can we be a bit more specific about what roles are actually missing? (note that for this question I'm thinking of roles as "who's the expert in that" rather than "credit") Perhaps somewhat ironically, the web page maintainer and "people-wrangling infrastructure" is not on this diagram, and there's a very non-trivial level of effort involved in that which trades off with the need to be fine-grained in the list of roles. To be clear, I'm not saying we don't need more, it's just not clear to me what on this diagram is not covered in the team page (or needs more breaking-out into sub-roles).
Re: @bsipocz
But it's time for that debt to stop being passed on to the people who keep this and the downstream integration afloat, and to start valuing them and their contributions and expertise.
I'm going to say this as clearly as possible: I at least do value said people. Thank you for your work.
@eteq - don't you feel the contradiction between your last two comments? If after all these discussions you still don't see what roles are missing from the page, and what roles are populated with people who haven't done said roles for years, then I give up, now for real.
And, I see both astropy.github.com as well as astropy-data on the diagram. That the webpage maintenance is not happening is not the fault of the diagram but of the procedure of how things are run. There have been people outside of the usual North American astronomy network who started to contribute significantly to the webpage, yet it wasn't good enough effort or do-ocracy at work for those who are allowed to select the maintainers.
re: @bsipocz
don't you feel the contradiction between your last two comments? If after all these discussions you still don't see what roles are missing from the page, and what roles are populated with people who haven't done said roles for years, then I give up, now for real.
I'm sorry for that. I do not see a contradiction here because in the first comment I am asking, no, begging, for actionable help. If you want me to re-phrase, how about: "What roles do you want added, please tell me." I don't want to say that anything anyone asks for gets added without question, because we have to maintain it, and you frequently point out that the page is not being kept up-to-date, and I don't want to make that problem worse. But that is not a statement that I'm refusing to add roles. So far, to be honest, I've agreed to just about everything you've suggested on the roles page even against my differing perspective, so I hope that should serve as a sign that I value your opinion, @bsipocz.
And, I see both astropy.github.com as well as astropy-data on the diagram. That the webpage maintenance is not happening is not the fault of the diagram but of the procedure of how things are run.
You are right! I was looking for "web page", but you're right that this covers that. I do want to highlight the "people-wrangling" element that still does not appear here, but you're right that I mis-read the diagram on the web page front. My mistake.
There have been people outside of the usual North American astronomy network who started to contribute significantly to the webpage, yet it wasn't good enough effort or do-ocracy at work for those who are allowed to select the maintainers.
Please tell me who you are thinking of, either here or elsewhere! If you have someone to nominate, I'm happy to hear it. If you want to post the nomination here or to astropy-dev if you think I'm ignoring something, that's fine too. I don't know of any instance where I intentionally have done this, but it certainly could have happened accidentally (see my comment above about being underwater).
I do want to highlight the "people-wrangling" element
This is an infrastructure layout diagram, so it's no wonder that community engagement and people management are not listed on it.
(note that for this question I'm thinking of roles as "who's the expert in that" rather than "credit")
And I think the team page should be populated with the people who, when pinged about issues falling under the given role, have some non-zero chance of actually doing something, not a random collection of people who might be able to figure out how to do the given role if they really have to. That also means track record. I suspect this is where we fundamentally disagree.
"What roles do you want added, please tell me."
Creating roles is actually the CoCo's realm. Those realms are guarded extremely closely, ever more so in the past year, and you went to great lengths to explain and emphasize that people outside that group are not part of leadership, proposals, and recognition. So you can't really have it both ways when things are gatekept.
Anyway, for more than a year I've been asking about a generalist maintainer role. One that covers the everyday interaction that e.g. Pey Lian is doing. I feel it's totally unfair to have her as utils maintainer only while the work she does is way more crucial for the everyday operation of the project. I clearly remember starting to ask explicitly about this last summer, then definitely at the NF summit, and again at the coordination meeting. Nothing happened besides being told that I should care less about the project. I also clearly said in many places that a generalist role that would fit people like Stuart, Derek, or Anne would be nice. A role that is not tied to a subpackage, as those are maintained by their "owners", but rather a way to step up without stepping on toes for people who are willing and able to jump in to multiple places to fix things (see e.g. Derek and the C codes). These people in many cases are also much more proactive in fixing issues than the subpackage owners, let alone many other roles, and I find it absolutely unfair to list one but not the other on a page that is used to distribute credit and money.
If anyone disagrees that these roles are important at the level of how fine-grained other parts of the team are, then raise those disagreements, because so far the only reaction I got was agreement and then silence.
Then there also needs to be a clear distinction between who is a core dev and who isn't. I would love to see, as a minimal recognition, the people who are actually building astropy (e.g. Nadia, Simon, Pey Lian, etc.) and know about the technical details have the chance to be invited to events where the expectation is to talk about those details. Currently everything is kept artificially obscure on this topic.
I think the idea of an overall project maintainer's role is a great one! I apologize if I've forgotten any convo we may have had about it in the past, but I'm pretty sure this is the first time I'm seeing it in writing.
I think this lack of documentation might be contributing to the frustration you describe as "agreement and then silence." I think it's super important (and difficult) to record agreed upon action items somewhere and we haven't been very good at it. If there are other things that you can recall, where there was agreement, but no action was taken, please open issues reminding all of us of those things.
Tom A's credit page is very much in line with this and got some motivation from the clear discrepancies between realities and representations on the team page. That was in writing, and it may also have been in writing in various meeting minutes and Slack convos; I certainly raised this to multiple people multiple times. (Nevertheless, nicely played card.) The credit page got ditched however, even though it received only positive feedback, in writing, out in the open. 🤷
Also, it's very sad that people have to complain very vocally, several times, for the CoCo to recognize who is working on what, that there are missing roles that need to be acknowledged, and that the fragmentation of roles on the page is extremely disproportionate to the efforts put into the roles, or the efforts needed to keep the project going.
The credit page got ditched however
I don't think that's true - as far as I'm aware we are still keen on it but it will need to wait for the next 'roles' survey since we'll be asking questions about yearly and lifetime contributions in there?
My last info was that it's being ditched, but it came by the scenic route so who knows where it originated from. Either way, it was just an illustration that it's not fair to say the topic of the lack of recognition of certain roles wasn't raised enough times, or that it wasn't written down.
On the topic of a generalist maintainer, I may have contributed to a misunderstanding, in which case I'm sorry: I personally am not in favor of a generalist maintainer role without some specific focus areas because I'm concerned that it leads us further down a trap that we are currently in: that the "hard" PRs don't get reviewed soon enough because the experts don't have the bandwidth (something @bsipocz has rightly pointed out several times) - but that does not mean I'm against there being any generalist maintainer role. I'm also not sure whether the person in this role should get write powers vs triage powers, so that was a detail that needed working out, and perhaps that thread got lost.
My last info was that it's being ditched, but it came by the scenic route so who knows where it originated from. Either way, it was just an illustration that it's not fair to say the topic of the lack of recognition of certain roles wasn't raised enough times, or that it wasn't written down.
The last I recall is that it got positive feedback on the mailing list, and we also discussed it positively in a CoCo meeting. Then in the Governance WG we discussed it as a way to start the voting members list, but the majority of the group didn't like it for that purpose. So it was tabled as not something that group needed right then. I agree it's a good idea, though, and in an effort to be more open about all of this I just made an issue for it (#135).
Also I want to mention, since it's gotten a bit lost in the thread: this diagram by @pllim is really useful and illuminating in the first place!
Oh, yes, I remember that meeting, that was the one where the ask for a schedule reshuffle was denied, even though it was accommodated for everyone else in previous meetings! The notes about the majority though: "Nowhere near unanimous….(3 or 4 yes, 3 or 4 nos, two or more coffee cups)"
But anyway, the point here was never about the voting members, but that there is a team webpage that is known to be inaccurate. At least a few of us feel that way, but the way it was handled was to silence these complaints and now come up with all kinds of diversions and bike sheds, after there was seemingly an agreement that something has to be done. "It wasn't in written form", "does it come with commit rights". I'm not talking about roles to be filled with 20 people who now suddenly all need write access. Frankly, we are talking about giving recognition and ownership to the people who spent the most time with the project, shepherding it, and who are the most comfortable with the whole workflow.
Then a generalist maintainer role would remove the current gatekeeping around core maintainership, which currently means only subpackage maintenance. Frankly, again, we are talking about people who know the workflow, as they would get into this category because of their track record of willingness to step up and fix issues (see the names above for examples), rather than the current practice of naming people subpackage developers even before they have had a single PR to core. I know the two of us disagree on this, I would prefer the first approach, you do the second, but I would like for the rest of the dev team to chime in. This of course needs a discussion, but there is no platform for that as the dev telecons ceased to exist.
Maintainer mentoring was a big action item on the plans for this year, and I fail to see anything happening (even before covid) other than reshuffling priorities by exercising the executive power of the CoCo, be it the scope and timing of releases or the focus of jobs.
Also, I fail to see how adding more people into the mix, either in a role that they have practically filled for years, or in a brand new role that can be part of the workflow to become more involved in the project, prohibits the subpackage maintainers from doing their role and reviewing the big PRs (and closing the ones that are apparently not to their liking, but kept open for years). Even the practice of adding new subpackage maintainers hasn't solved many of the bottlenecks when they haven't been given training/mentoring about the workflow and the original ones still want to see, review and approve everything.
Maintainer mentoring was a big action item on the plans for this year, and I fail to see anything happening
Even without covid, the main priority which was a necessary condition before setting up a mentoring scheme was to figure out how to pay project members (one of the primary goals of the grant), and this has been one of the main priorities of the first half of the year. We need this because we need to be able to optionally pay people to be mentors as well as optionally pay people to run the scheme (since despite the discussions about this at several coordination meetings no one has volunteered to try and take this on - any project member can take initiative to get things started but that hasn't happened). Now that paying people is well underway and that the governance discussions are entering a new phase, I think it will be easier to start planning a mentoring scheme and we can make it a priority of the coming academic year. I'm personally very interested in making that happen.
I made a specific issue (#136) to discuss the proposal of a generalist maintainer role. As I mentioned there, I will also crosspost that issue to the several forums which do indeed exist for discussion. (If there are any other ideas which anyone thinks warrant further discussion, please feel free to open an issue here, post to the astropy-maintainers or astropy-dev email list, or start a convo in the Slack #project channel.)
To broadly address some of the things which Brigitta has brought up, I have two comments:
First, in any organization, just because a problem is known doesn't mean the solution to that problem is known, nor that the staffing/motivation is there to implement any possible solution. I think it's critically important in any setting, but even more so in an organization like this one, to include possible actionable solutions alongside any discussion of things which aren't working well. Just continually pointing out a problem is insufficient for getting that problem solved. (And often, it is actually the least effective way of getting that problem solved.)
Second, advocates and a "shepherd" or two helping to move things along are absolutely crucial in order to get changes and new things implemented in an organization where the folks involved in that organization are already stretched thin. This is not unique to this project, it's pretty universal. Just because folks agree that something is a good idea and would do something about it in an ideal world, doesn't mean it will get done. Documenting, nudging, nagging, or oftentimes, just taking the lead and doing it yourself, are all often required to get change implemented.
Documenting, nudging, nagging, or oftentimes, just taking the lead and doing it yourself, are all often required to get change implemented.
Because that especially worked out so well for me; taking on more and more over the years didn't change anything. Do-ocracy only works here if none of the 4 of you fancies vetoing it, even if there is a wider consensus. In fact, access that I have been asking for for years (yes, there is a written trace) is denied, e.g. the access to the team settings. Rather, the CoCo keeps that with them, so each time we need to go around and ask (often multiple times) for people to be added to teams to be able to assign them to issues. The release procedure and devops are similarly full of workarounds; I have yet to hear about another project where the people doing those tasks have no access to webhooks or integrations and are not able to follow their own release procedure as it was written with access rights in mind. So that the CoCo is stretched thin is partly due to them unnecessarily playing micromanagement and BDFL. On the other hand, there is time to explain how much I'm not part of the project, and that the roles I'm involved in are the bottleneck for the development.
Coming back to this / reminded of this by @pllim at the coordination meeting: I still think this chart is useful! If it doesn't make sense to appear on the website, can we put it into this repo at least?
Also... https://xkcd.com/2347/
Wow, it's been 3 years. Before the coordination meeting this year, I will try to:
- Update this chart.
- Upload this new version to existing Zenodo.
- Just put it in this project repo.
I have updated this chart (see original post) and also officially added it to https://github.com/astropy/astropy-project/tree/main/infrastructure . I will keep this issue open because the topic is still true.
I just revisited this for the Astropy Coordination Meeting 2024. I think the situation now is less dire than when I opened this issue.
Now that we have a roadmap and a dedicated section (https://github.com/astropy/astropy-project/blob/main/roadmap/roadmap.md#infrastructure-documentation), I think I can close this issue. Going forward, such things should be raised as a roadmap issue, perhaps.
Thanks, all!