
Governance and Change Management Review After Breaking Change in #4870

Open ianballard opened this issue 9 months ago • 71 comments

The recent PR #4870, authored by @abravalheri and approved by @jaraco, introduced a breaking change that immediately and broadly disrupted the Python ecosystem by disallowing deprecated dash-separated and uppercase options in setup.cfg. This was merged despite clear acknowledgment that it would break downstream projects:
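For readers who haven't followed the deprecation, the spellings in question look roughly like this (an illustrative setup.cfg fragment; the field name is only an example):

```ini
[metadata]
# Deprecated spellings, rejected outright by setuptools 78.0.0:
Author-Email = dev@example.com
# Canonical form expected going forward:
author_email = dev@example.com
```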

"if we are feeling brave. But it can break packages that have not addressed the warning." — @abravalheri

"I'm inclined to say we should do it, even though it will cause some disruption." — @jaraco

This change effectively broke builds across the ecosystem with no grace period, runtime fallback, or migration tooling—blocking unrelated emergency patches and releases in critical production systems. The follow-up PR #4909 that removed requests from integration tests—because this change broke internal usage—further illustrates how unprepared the project itself was for the consequences of this decision.

Bug #4910 showed just how widespread the issue was. After widespread backlash, the change was reluctantly reverted in #4911, but when asked if an opt-out would be provided for a future reintroduction, the author responded:

“The option is to pin setuptools via the PIP_CONSTRAINT environment variable... All things considered this is likely a symptom of something bigger, more problematic, and the package you depend upon has been on borrowed time for a while... Organise among other users contributions (or even a fork) for maintenance. Fund the original developers for maintenance. Find a suitable replacement.” — @abravalheri

This dismissive response shifts responsibility entirely to users and downstream projects without acknowledging the maintainers’ own role in ecosystem stability. It shows a fundamental lack of empathy for maintainers and users who depend on stability and predictable deprecation cycles.

This raises serious questions about:

  • Change management and deprecation enforcement policy: Why was a breaking change merged without community consensus or a proper deprecation window?
  • Contributor and reviewer accountability: How are high-impact decisions vetted, especially those with acknowledged ecosystem-wide fallout?
  • Branch protection and CI governance: How did this change pass review without ecosystem impact validation or internal test failures acting as a safeguard?

We urgently request the following:

  • A postmortem detailing what went wrong and how this will be prevented in the future.
  • Clear documentation of your deprecation and backwards-compatibility policy.
  • Transparency around contributor/reviewer permissions and decision-making authority.
  • A published governance or escalation model for controversial or high-impact changes.

The Python ecosystem relies heavily on setuptools, and changes of this magnitude cannot be driven by personal opinion or experimentation—especially when knowingly disruptive.

This incident has eroded trust. Rebuilding that trust will require serious introspection and reform.

ianballard avatar Mar 25 '25 14:03 ianballard

To clarify, are you offering resources in the form of long term support or funding to setuptools to support your urgent request of these seemingly resource intensive corporate-like governance structures?

Separately, while I do think there are open questions about build backends being able to make backwards compatible changes, I also think this is an opportunity for users to self reflect on:

  1. Why are you building sdists and not installing wheels?
  2. Why do you not have an internal cache (artifactory, docker, etc.) to protect you from external build infrastructure changes?
  3. For your own projects why are you depending on setuptools and not a more modern build system? (e.g. flit, hatchling, PDM, poetry_core, and soon uv_build).

notatallshaw avatar Mar 25 '25 14:03 notatallshaw

To clarify, are you offering resources in the form of long term support or funding to setuptools to support your urgent request of these seemingly resource intensive corporate-like governance structures?

Separately, while I do think there are open questions about build backends being able to make backwards compatible changes, I also think this is an opportunity for users to self reflect on:

  1. Why are you building sdists and not installing wheels?
  2. Why do you not have an internal cache (artifactory, docker, etc.) to protect you from external build infrastructure changes?
  3. For your own projects why are you depending on setuptools and not a more modern build system? (e.g. flit, hatchling, PDM, poetry_core, and soon uv_build).

@notatallshaw

Respectfully, this response misses the point.

Setuptools is a foundational project with deep, transitive integration across the Python ecosystem. The concern raised is not about the technical direction of the project, but about process, governance, and the way this specific breaking change was introduced:

  • No migration tooling or opt-out path was provided.
  • No deprecation period or clear policy was referenced.
  • The change broke the project’s own tests and many others in the ecosystem.
  • The tone of communication from core contributors was dismissive and unprofessional when concerns were raised.

This is about the standard of care expected from maintainers of critical infrastructure.

To your question: if the setuptools team lacks the resources to implement safeguards before merging breaking changes, that is a valid and important discussion to have. But the right response to limited resources is caution, not recklessness.

If you're open to support, I’d be happy to help facilitate discussion with Python community stakeholders who are willing to support long-term stability—because what's at risk here is not just a broken build, but trust in the reliability of Python's core tooling.

Let’s not turn this into a false binary between modernization and accountability. We can and must have both.

ianballard avatar Mar 25 '25 14:03 ianballard

More data points:

  • a week ago setuptools 77.0.1 broke several packages that use a non-standard layout which is commonly used in monorepos and projects where Python is just one of many language bindings to a C/C++ library, #4892
  • last year setuptools 72.0.0 broke hundreds of packages that were relying on setup.py test and setuptools.command.test, #4519

Why are you building sdists and not installing wheels? Why do you not have an internal cache (artifactory, docker, etc.) to protect you from external build infrastructure changes?

Linux distributions and some companies do not rely on wheels from PyPI. Instead, they rebuild all packages from source and store the sources + wheels in an internal package index. I detected bugs #4892 and #4910 within an hour of the setuptools release because the latest release broke the test instance of our build pipeline, which rebuilds all our wheels from sdists. We run regular tests to ensure that we can build and rebuild the latest versions.

If you are interested in learning more about our toolchain: https://github.com/python-wheel-build/fromager

tiran avatar Mar 25 '25 14:03 tiran

To clarify, are you offering resources in the form of long term support or funding to setuptools to support your urgent request of these seemingly resource intensive corporate-like governance structures?

Separately, while I do think there are open questions about build backends being able to make backwards compatible changes, I also think this is an opportunity for users to self reflect on:

  1. Why are you building sdists and not installing wheels?
  2. Why do you not have an internal cache (artifactory, docker, etc.) to protect you from external build infrastructure changes?
  3. For your own projects why are you depending on setuptools and not a more modern build system? (flit, hatchling, PDM, and soon uv_build).

Thank you for the opportunity to self-reflect; the whole community is forever grateful! On a more serious note, I believe this is exactly what the author meant when he said "dismissive attitude". But to take things one at a time:

  1. As others have pointed out, Linux distributions and downstream vendors rebuild wheels from source. Distributions like Debian, Fedora, and Gentoo do not use wheels; they take sdists and rebuild wheels.
  2. While people may have an internal cache of versions, this has nothing to do with the fact that a change was introduced that broke everything.
  3. Are you essentially telling people to introspect on why we use setuptools at all? This comes across as advocating for people not to use this tool at all. Is that really the intention?

SamuZad avatar Mar 25 '25 14:03 SamuZad

@ianballard @SamuZad To be clear I am not a setuptools maintainer, so please don't take my replies as such.

@ianballard

Respectfully, this response misses the point.

Setuptools is a foundational project with deep, transitive integration across the Python ecosystem. The concern raised is not about the technical direction of the project, but about process, governance, and the way this specific breaking change was introduced:

  • No migration tooling or opt-out path was provided.

  • No deprecation period or clear policy was referenced.

  • The change broke the project’s own tests and many others in the ecosystem.

  • The tone of communication from core contributors was dismissive and unprofessional when concerns were raised.

Besides the last point, in which I can't speak for setuptools, I do largely discuss the issues of build backends making breaking changes in the post I linked to. Edit: Fixed hyperlink

To your question: if the setuptools team lacks the resources to implement safeguards before merging breaking changes, that is a valid and important discussion to have. But the right response to limited resources is caution, not recklessness.

Yes, most Python packaging foundational projects are severely under resourced, that's why I was asking you to clarify if you are going to provide funding for making these urgent requests.

@SamuZad

As others have pointed out, Linux distributions and downstream vendors rebuild wheels from source. Distributions like Debian, Fedora, and Gentoo do not use wheels; they take sdists and rebuild wheels.

Sure, but the user reports did not sound like Linux distros, which often vendor their own version of setuptools, have custom build scripts, and don't typically use pip or other Python front-end tools to build. Are you a Linux distro that had this problem yesterday? Why did you have these problems? Are you not vendoring your own version of setuptools?

While people have an internal cache of versions, this has nothing to do with the fact that a change was introduced that broke everything

Wheels weren't broken, and wheels are what most users are expected to use to install Python packages. It is still unclear to me why most of the users reporting the situation were depending on building sdists; understanding that should be part of this “post-mortem”. We should be encouraging users to move to wheels.

Are you essentially telling people to introspect on why we use setuptools at all? This comes across as advocating for people not to use this tool at all. Is that really the intention?

Remember, I'm not a setuptools maintainer, but I am invested in Python packaging, so this is "just my opinion", but yes, I would strongly recommend looking at using flit or hatchling instead of setuptools. I have already done so for all my work projects last year.

notatallshaw avatar Mar 25 '25 14:03 notatallshaw

I am flabbergasted as to why v78.0.1 still has not been yanked from PyPI.

https://pypi.org/project/setuptools/#history

For much less impact, v75.9.0 was yanked, and yet the problematic v78 survives.

There is zero reason it should have been kept during the abrupt event, and even less reason now that the feature has been rolled back.

inoa-jboliveira avatar Mar 25 '25 14:03 inoa-jboliveira

Sure, but the user reports did not sound like Linux distros, which often vendor their own version of setuptools, have custom build scripts, and don't typically use pip or other Python front-end tools to build. Are you a Linux distro that had this problem yesterday? Why did you have these problems? Are you not vendoring your own version of setuptools?

I'm working on a Linux distro. The test instance of our build pipeline was affected by two breaking changes in setuptools in the past 7 days. Our test pipeline builds with the latest version of all our dependencies to detect these kinds of breaking changes. See https://github.com/pypa/setuptools/issues/4919#issuecomment-2751481371

tiran avatar Mar 25 '25 14:03 tiran

I'm working on a Linux distro. The test instance of our build pipeline was affected by two breaking changes in setuptools in the past 7 days. Our test pipeline builds with the latest version of all our dependencies to detect these kinds of breaking changes. See #4919 (comment)

Thanks for the info, I read your earlier comment but I found something a little ambiguous, do you use the latest setuptools in a test pipeline to validate everything works? Or is it part of the real build pipeline and this caused the Linux distro to be unable to publish new packages?

Any post-mortem would need to understand the impact, and understand the state of current practice, to fix things and to advise on new best practices.

notatallshaw avatar Mar 25 '25 15:03 notatallshaw

I understand that people are frustrated, but I don't think this tone is really a good way to start a constructive discussion. If you start with hostility and demands, the discussion is likely to derail into further hostility, counter-hostility and we will be unlikely to reach any solutions, or even focus on the right problems.

Yes, I am often frustrated too. Gentoo Linux is a source-first distribution, so the vast majority of our users are building from source, and are downstream to all Python build systems. Major disruptive changes bring a lot of frustration, because in the end they put the burden of fixing the issues on others.

But I do realize that setuptools wants to go forward, and doesn't want to have to support deprecated functionality forever. However, I feel like it's an ecosystem problem.

Firstly, we do not have a reliable way of reporting deprecations at the build-system level. The way it is currently done, deprecation warnings are printed in the middle of a very verbose build log produced by setuptools, and are easily missed. On top of that, some tools don't print the build log at all. And even if we solved that and made wheel-building/installing tools catch deprecation warnings and make them more visible, a lot of packages only build wheels as part of their CD pipeline, and their maintainers rarely look at the output. Of course, there's the option of having a -Werror style behavior, but as you can see, the list of requirements is already growing, and that would be yet another thing we'd expect every project to implement.
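To make that last point concrete, the -Werror style knob that exists today is roughly the following (a blunt instrument, for the reasons above: it trips on deprecation warnings from every transitive dependency, not just the build backend):

```shell
# Sketch: escalate DeprecationWarning (and its subclasses) to errors
# for a build step via the standard PYTHONWARNINGS filter. The warning
# message below is a stand-in for a real backend deprecation notice.
PYTHONWARNINGS='error::DeprecationWarning' python3 -c \
  "import warnings; warnings.warn('soon-to-break option', DeprecationWarning)" \
  || echo "build would have failed loudly here"
```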

Secondly, Python packages get abandoned. We're dealing with dozens of packages that are either not maintained at all or barely maintained — and many of them obviously still "work", at least as far as users are concerned, and are dependencies of many projects. They often use setuptools; some of them don't supply wheels for one reason or another. Breaking changes break these packages, and when there's no active maintainer to quickly make a new release, things go very wrong (pipelines get broken, people start independently forking the same package with no intention of future maintenance…).

mgorny avatar Mar 25 '25 15:03 mgorny

@mgorny Totally hear you.

This really is an ecosystem problem, and that’s exactly why setuptools is in a great position to lead the way. The pushback isn’t about avoiding progress—it’s about wanting changes to be smoother, more transparent, and less disruptive.

This could be an opportunity to do things better: clearer deprecation paths, better tooling, and more open comms so we move forward without breaking trust along the way.

ianballard avatar Mar 25 '25 15:03 ianballard

  1. Why do you not have an internal cache (artifactory, docker, etc.) to protect you from external build infrastructure changes?

We do. Almost all packages just pass through, unless we discover something that causes us to decide otherwise. But every package that lives on the blocked list increases the burden on us (across thousands of end users).

And if a tool doesn't state in which versions deprecations will take effect, it is impossible to know what version constraints to choose. <-- Edited to add this

  1. For your own projects why are you depending on setuptools and not a more modern build system? (e.g. flit, hatchling, PDM, poetry_core, and soon uv_build).

Those are all less mature projects than setuptools. We are expecting setuptools to act like a mature project with ecosystem-wide dependents.

jamesliu4c avatar Mar 25 '25 16:03 jamesliu4c

Those are all less mature projects than setuptools. We are expecting setuptools to act like a mature project with ecosystem-wide dependents.

For better or worse setuptools comes with a lot of legacy as it predates "modern" Python packaging standards, so in some cases it is less mature in supporting specific standards as it has taken the maintainers time to migrate.

Flit is now over 10 years old, designed to be simple, and focus on standards: https://github.com/pypa/flit. If you are looking for stability, which is what I assume you are using "mature" as a proxy for, then flit might be a better choice for your own build system.

notatallshaw avatar Mar 25 '25 16:03 notatallshaw

For your own projects why are you depending on setuptools and not a more modern build system? (e.g. flit, hatchling, PDM, poetry_core, and soon uv_build).

We're depending on uv, but uv in a CI environment apparently installs the latest setuptools as part of the bootstrap process for its build environment. We depend on pylatex, which it seems doesn't supply a wheel on PyPI, which means our builds broke. (uv sync doesn't respect build-dependencies pinning due to a bug, so there was no way to even fix it, at least not in a way that worked with third-party repos that don't have timestamps on them (e.g. torch).)

Are we seriously suggesting every end-user of Python audits all their third-party deps, figures out how to repackage all of those that don't provide wheels, self-hosts them, and sets up CI processes to automate all that? As a simple end-user of Python, that sounds like a pretty crazy solution. What's even the point of having a package manager and a PyPI index/repo at that point?

herebebeasties avatar Mar 25 '25 17:03 herebebeasties

We're depending on uv, but uv in a CI environment apparently installs the latest setuptools as part of the bootstrap process for its build environment.

It doesn't; you should outline your scenario to uv, see why that's happening, and fix it.

If you create a project with uv the default build backend is currently hatchling (soon uv_build). Otherwise it should take whatever you've specified in your pyproject.toml.

uv sync doesn't respect build-dependencies pinning due to a bug, so there was no way to even fix it

It absolutely respects build dependencies, there's a bug that it doesn't respect --build-constraint, which I believe uv is fixing with high priority.

notatallshaw avatar Mar 25 '25 17:03 notatallshaw

We're depending on uv, but uv in a CI environment apparently installs the latest setuptools as part of the bootstrap process for its build environment.

It doesn't, you should outline your scenario to uv and see what that's happening and fix it.

Maybe I don't understand how this stuff works, it's certainly complicated enough. I think what I meant to write is that by default, uv builds all packages in isolated virtual environments, as per PEP 517. For pylatex and seemingly many, many other projects that are distributed as sdists, this environment pulls in the latest setuptools as part of what I might incorrectly be labelling the "bootstrap" phase of building the module. This was reproducible locally if you simply blew away your uv cache and ran a uv sync.

uv sync doesn't respect build-dependencies pinning due to a bug, so there was no way to even fix it

It absolutely respects build dependencies, there's a bug that it doesn't respect --build-constraint, which I believe uv is fixing with high priority.

Apologies, I meant to write build-constraint-dependencies - see https://github.com/astral-sh/uv/issues/12441

herebebeasties avatar Mar 25 '25 17:03 herebebeasties

Maybe I don't understand how this stuff works, it's certainly complicated enough. I think what I meant to write is that by default, uv builds all packages in isolated virtual environments, as per PEP 517. For pylatex and seemingly many, many other projects that are distributed as sdists, this environment pulls in the latest setuptools as part of what I might incorrectly be labelling the "bootstrap" phase of building the module.

Ah, the point of confusion is that you replied to my point about "For your own projects", so I assumed you were talking about your own project, but it appears you're not.

For installing other projects, the build system is out of your control; in general, though, it's better to install from wheels rather than sdists where possible, and then the build system isn't involved.

notatallshaw avatar Mar 25 '25 17:03 notatallshaw

And if a tool doesn't state what versions deprecations will take place, it is impossible to choose what version constraints to choose. <-- Edited to add this

I feel compelled to link @hynek's brilliant article here: https://hynek.me/articles/semver-will-not-save-you/#taking-responsibility. Using PIP_CONSTRAINT is a great stop-gap that I've been employing for years to make my CIs evergreen and retain control over handling things that break.
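A minimal sketch of that stop-gap (the file name and version cap here are illustrative, not a recommendation):

```shell
# Write a constraints file and point pip at it via PIP_CONSTRAINT.
# pip also applies this constraint inside the isolated build
# environments it creates for sdists, which is what matters here.
printf 'setuptools<78\n' > build-constraints.txt
export PIP_CONSTRAINT="$PWD/build-constraints.txt"
# A subsequent `pip install some-sdist-package` (placeholder name)
# would now refuse to pull a setuptools outside the constraint.
echo "constraining builds with: $(cat "$PIP_CONSTRAINT")"
```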

P.S. @ianballard since you've created the issue using the blank template, you probably didn't see the CoC link present in other forms. So I'll leave it here for you: https://github.com/pypa/.github/blob/main/CODE_OF_CONDUCT.md. I'm sure you didn't mean it, but your OP reads like an attack. Please, make an effort to interact constructively. This is no place for making demands towards maintainers that gifted you their free labor.

webknjaz avatar Mar 25 '25 18:03 webknjaz

P.S. @ianballard since you've created the issue using the blank template, you probably didn't see the CoC link present in other forms. So I'll leave it here for you: https://github.com/pypa/.github/blob/main/CODE_OF_CONDUCT.md. I'm sure you didn't mean it, but your OP reads like an attack. Please, make an effort to interact constructively. This is no place for making demands towards maintainers that gifted you their free labor.

Thanks for linking Hynek’s article—it makes a strong case for why clarity and predictability around changes matter, especially in a critical part of the ecosystem like this.

I’m deeply grateful to the contributors who maintain and improve setuptools. I recognize the time, care, and effort involved—and that much of it goes unacknowledged. Your work is appreciated.

I want to be clear that nothing in my original post was meant as an attack. The concerns I raised were serious because the impact was serious. Calling for transparency and accountability—especially in light of how dismissive some contributor responses were—is both reasonable and necessary.

This change caused widespread breakage without clear timelines, migration paths, or safeguards. Raising that isn’t hostility—it’s about protecting trust and pushing for processes that match the project’s role in the ecosystem.

Again, I truly appreciate the work that goes into this project. But as a community, we need to address how this was handled so it doesn’t happen again—especially the way concerns were dismissed after the fact.

ianballard avatar Mar 25 '25 19:03 ianballard

I oppose @ianballard's suggestion. I'd rather volunteers build whatever they feel like building without the burden of feeling that they owe anything to anyone. Without them there would not be any setuptools at all. These people should have good, honest fun and nothing else.

sinoroc avatar Mar 25 '25 19:03 sinoroc

@webknjaz

P.S. ianballard since you've created the issue using the blank template, you probably didn't see the CoC link present in other forms. So I'll leave it here for you: https://github.com/pypa/.github/blob/main/CODE_OF_CONDUCT.md. I'm sure you didn't mean it, but your OP reads like an attack. Please, make an effort to interact constructively. This is no place for making demands towards maintainers that gifted you their free labor.

Like @ianballard, I appreciate all of the work provided by the PyPA.

Even so, I don't think this issue reads as an attack. It was well-reasoned, constructive, and calm. Especially so given the amount of disruption that was caused yesterday. Yes, there are serious concerns being raised. Those concerns do lead one to wish for changes. But how does one advocate for changes without stating what those changes are?

Nevertheless, these questions are on many of our minds. Refusal to engage is one way of addressing them, and if that is what comes of it, then we know where we stand. As for the urgent requests, they are requests, not demands. If the PyPA feels they are unreasonable, or cannot be met with present resources, that is fair; we can have a discussion about it. What can we, as a community, do to help get these requests met? What can I do? I am willing to contribute.

jamesliu4c avatar Mar 25 '25 19:03 jamesliu4c

This change caused widespread breakage without clear timelines, migration paths, or safeguards. Raising that isn’t hostility—it’s about protecting trust and pushing for processes that match the project’s role in the ecosystem.

One good solution, for users who can afford to invest the time in it, is to trust updates less. If a faulty update from a volunteer project is going to cause you unacceptable losses, I strongly suggest you design your build system not to trust updates, or to minimize the amount it has to trust.

At my work we had no impact, even though several of our dependencies were affected, largely because we avoid sdists, cache built sdists, and use docker layers to cache all dependencies. I appreciate that this is not free, but if the cost of failing CI is high enough, it's probably worth it.

Again, I truly appreciate the work that goes into this project. But as a community, we need to address how this was handled so it doesn’t happen again—especially the way concerns were dismissed after the fact.

As part of the community are you going to organize, or provide, funding or resources to the setuptools maintainers to address these urgent requests and reviews? You haven’t clarified on that, and if not, what expectations do you have for volunteers to work on these urgent requests you've asked for?

Maybe I'm misguided here and the setuptools maintainers are happy to work on your urgent requests in their own time, but when I request that others do urgent work for me, I come with a proposal for how to fund it.

notatallshaw avatar Mar 25 '25 20:03 notatallshaw

As part of the community are you going to organize, or provide, funding or resources to the setuptools maintainers to address these urgent requests and reviews? You haven’t clarified on that, and if not, what expectations do you have for volunteers to work on these urgent requests you've asked for?

I did touch on this earlier, but to clarify: yes, I understand that maintainers are volunteers, and I deeply appreciate the work that goes into sustaining this project. I’m advocating for stronger processes around high-impact changes—especially when those changes break downstream projects, including setuptools’ own tests, as seen in PR #4909. Based on this, it seems some of these processes are already in place (so no extra work is needed); they were simply missed in this case. To move forward, we need to be honest about what happened so we can learn from it and make sure it doesn’t happen again.

This isn’t about blaming individuals or holding volunteer maintainers to unrealistic standards. It’s about recognizing that critical infrastructure needs extra care, and that trust depends on how change is introduced and communicated.

If resources are a blocker to improving those processes, I’d be happy to help facilitate that conversation. But we can’t just ignore the problem.

ianballard avatar Mar 25 '25 20:03 ianballard

Maybe I can clarify a little bit by saying what kind of governance changes I would like to see. I opened a new issue with a specific change I would like made about the deprecation in question.

Can we get some open and honest discussion of that issue?

jamesliu4c avatar Mar 25 '25 21:03 jamesliu4c

Setuptools is a foundational project with deep, transitive integration across the Python ecosystem. The concern raised is not about the technical direction of the project, but about process, governance, and the way this specific breaking change was introduced:

Sure, let's talk about that.

No migration tooling or opt-out path was provided.

It's hard to imagine "migration tooling" for the task of replacing a hyphen with an underscore in setup.cfg. Maybe there's a generic config-file normalizer out there.
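For what it's worth, the entire "migration" fits in a few lines. A hedged sketch (assumes GNU sed for the \l lowercase escape, and that option names start at column 0, with indented lines being continuation values):

```shell
# Normalize deprecated "Dash-Separated"/"Uppercase" option names in a
# setup.cfg stream to lowercase-with-underscores. Purely illustrative:
# it only rewrites the key part of "key = value" / "key: value" lines.
normalize_cfg() {
  sed -E ':loop
    s/^([A-Za-z0-9_]*)([A-Z])([A-Za-z0-9_-]*[[:space:]]*[=:])/\1\l\2\3/
    s/^([A-Za-z0-9_]+)-([A-Za-z0-9_-]*[[:space:]]*[=:])/\1_\2/
    t loop'
}
printf '[metadata]\nAuthor-Email = dev@example.com\n' | normalize_cfg
```

Section headers and indented continuation lines (e.g. classifier values containing `::`) are left untouched because both substitutions are anchored to a key at the start of the line.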

An "opt-out path" was indeed provided: not using a new major version of Setuptools. That's why Setuptools follows semver: to advertise breaking changes.

Without the ability to make breaking changes, there is no hope of ever cleaning up legacy cruft in projects. I would wager that for the overwhelming majority of projects using Setuptools, only a small fraction of Setuptools code paths are relevant, or ought to be relevant. Setuptools is freshly downloaded ~20 million times a day for all these isolated build environments (it seems like local caches aren't doing their job a lot of the time, which is a separate issue) so being able to cut down the wheel size would save quite a bit of resources. Removing unnecessary code also tends to speed up Python projects, because importing it also has a cost. (If you don't believe me, try timing how much overhead the --python option causes in Pip, versus a direct subprocess invocation of a new Python process.)

Many users struggled to fix broken builds because their transitive dependencies didn't specify a Setuptools version cap and they either couldn't restrict the Setuptools version their tools used for isolated build environments, didn't know how to set up a non-isolated build environment, etc. It's hard to say that this is Setuptools' fault.

But that's for users who actually have to build those packages locally. And, as @notatallshaw pointed out, at least part of that problem is caused by package maintainers not pre-building trivial projects (e.g. ones that are completely in Python aside from dependencies) when they should. Which would also flag the issue for those maintainers. (They're the ones the deprecation notices are intended for in the first place!)

(By the way, Damian: you say correctly that your local caching isn't free, but I can imagine it being less expensive in the long run than repeatedly downloading things and building them fresh - including new downloads for build dependencies. Not to mention the costs borne by others. Bandwidth isn't free either, and PyPI operates at petabyte scale - as I'm sure you're aware. Let me take this opportunity to say: thank you, Fastly!)

No deprecation period or clear policy was referenced.

The underlying deprecation occurred four years (twice as long as was recently deemed appropriate for the Python standard library) and 24 major versions ago. How long is enough?

The change broke the project’s own tests and many others in the ecosystem.

Yes. That's why it was explicitly labelled as a breaking change, and published following accepted standard practices for breaking changes (i.e., a bump of the major version number, in a project that advertises that it uses semver).

The tone of communication from core contributors was dismissive and unprofessional when concerns were raised.

The response was initially dismissive because the scope of the problem wasn't immediately obvious. But the change (again: following up on a deprecation from four years ago) was reverted within less than six hours. In my book, actions speak louder than words - I find it hard to call that "dismissive".

I didn't see anything from the developers that I'd consider unprofessional. I did see a ton of noise from third parties, typically in forms like "this also affects XYZ package", "this is outrageous", etc. None of that helps. (What does help is e.g. the comment that included search results for the now-broken key in setup.cfgs on GitHub.)

This is about the standard of care expected from maintainers of critical infrastructure.

Organizations that would be severely disrupted by a bleeding-edge version of a package being unusable for their purposes for a few hours, ought to already know how to avoid having their tooling automatically pull in bleeding-edge versions. It could just as easily happen to their install-time dependencies, after all. (And in that case, they could spend considerable time on a build only to discover a problem in testing.)

The same sort of thing happened with Setuptools 72. (It took a bit over 13 hours to revert that change, but it was overnight in North America.)

And very little of the outrage seems to be getting directed at that tooling. Instead, the Setuptools issue filled with comments from people trying to help each other with workarounds for that tooling.


I actually agree that the specific change in Setuptools 78 shouldn't have been made - at all, without a compelling case that not breaking it would get in the way of something useful. It's pretty trivial to leave in a few lines of code to normalize case, hyphens vs underscores, etc. Going forward, I'd like to see Setuptools batch together planned breaking changes and announce them. (Not that I'm particularly hopeful that the right people would pay attention to the announcements.) It seems a little ridiculous to me that we're on major version 78 of Setuptools, and trying to make changes like this only to be forced into reverting them is certainly a contributing factor to that.

But the response I've seen so far looks like a pitchfork mob, and a largely misdirected one at that.


@tiran :

I'm working on a Linux distro. The test instance of our build pipeline was affected by two breaking changes in setuptools in the past 7 days. Our test pipeline builds with the latest version of all our dependencies to detect these kinds of breaking changes.

Did this cause a significant problem? Having discovered in the test pipeline that the latest version of a build-time dependency fails to build a given package, is there not any system in place to roll back the build-time dependency? (I assume you are not reliant on Pip etc.'s build isolation.) Is there an organizational reason the package can't be built and released with the prior, known-working dependencies? After all: the package can't really be expected to know which (if any) future version of Setuptools will break on its current configuration data. The maintainers don't have a crystal ball. They can be conservative, but some people think that's a bad idea, even for build-time dependencies.


@herebebeasties :

Are we seriously suggesting every end-user of Python audits all their third-party deps, figures out how to repackage all of those that don't provide wheels, self-host them, and set up CI processes to automate all that?

I assume you're talking about the person who runs pip install. Known emergency workarounds include:

  • Specifying an upper-version cap for Setuptools in a constraints.txt file, and then telling Pip to use that via the PIP_CONSTRAINT environment variable

  • Manually setting up a build environment with e.g. setuptools<78, and then passing --no-build-isolation to pip install

Once Pip succeeds in building a wheel, it's cached - so I can't fathom why people would be concerned with self-hosting wheels or setting up CI processes unless they already have those things. In the hypothetical world where, say, Requests doesn't ship a wheel, end users can obtain a wheel this way and install it in arbitrarily many virtual environments.
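As a concrete sketch of those two workarounds (the package name below is a placeholder, and the pin is just an example; adjust both as needed):

```shell
# Workaround 1: constrain Setuptools inside pip's isolated build environments.
printf 'setuptools<78\n' > constraints.txt
export PIP_CONSTRAINT="$PWD/constraints.txt"
# pip install some-sdist-only-package   # hypothetical package; its build now sees setuptools<78

# Workaround 2: manage the build environment yourself and skip pip's isolation.
# python -m venv buildenv
# buildenv/bin/pip install 'setuptools<78' wheel
# buildenv/bin/pip install --no-build-isolation some-sdist-only-package
```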

zahlman avatar Mar 26 '25 00:03 zahlman

Did this cause a significant problem? Having discovered in the test pipeline that the latest version of a build-time dependency fails to build a given package, is there not any system in place to roll back the build-time dependency? (I assume you are not reliant on Pip etc.'s build isolation.) Is there an organizational reason the package can't be built and released with the prior, known-working dependencies? After all: the package can't really be expected to know which (if any) future version of Setuptools will break on its current configuration data. The maintainers don't have a crystal ball. They can be conservative, but some people think that's a bad idea, even for build-time dependencies.

At least for Gentoo, it is certainly possible to roll back the dependency. In fact, it's a generalization of the caution appropriate for any foundational package with an above-average tendency to ship major versions whose breaking changes do, as it happens, break things.

Simply put: the distro maintains a globally consistent ecosystem of thousands of packages, where every package is expected to work with the latest versions of all its dependencies, and when packages are not compatible:

  • the common case is to patch them (and submit patches upstream to enable support for newer versions so that upstream can decline the patch by saying only pip install is supported)
  • the setuptools case is to hard-mask newer setuptools versions so that nobody can install updates to setuptools (hopefully until a new setuptools fixes the regression)

This is necessary mostly not because of deliberate breaking changes, but because setuptools historically used the Monkeypatch Oriented Programming (MOP) model and now making changes tends to affect other components at a distance (plus literally all of its code is a public API for people writing setuptools plugins, and any package using setup.py counts as writing a setuptools plugin). So it's become relatively clear that any new setuptools version, whether it's a semver major or not, should be treated with caution and the default approach should be to deny it until it's been proven.
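To illustrate the action-at-a-distance problem (a toy sketch with invented names, not actual setuptools internals):

```python
import types

# A stand-in for a setuptools-like module; all names here are made up for the sketch.
core = types.ModuleType("fake_build_core")

def _finalize(opts):
    return dict(opts)

core.finalize = _finalize

# A "plugin" monkeypatches the shared function -- classic MOP.
_orig = core.finalize
def _patched(opts):
    return _orig({**opts, "patched": True})
core.finalize = _patched

# Every caller anywhere now sees the patched behaviour; renaming `finalize`
# or changing its signature upstream would silently break the plugin.
print(core.finalize({"name": "demo"}))  # -> {'name': 'demo', 'patched': True}
```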

(A distro can get away with this because a distro can do repodata patching. For example, Conda is a distro, and can do it as described at https://prefix.dev/blog/repodata_patching and https://bioconda.github.io/developer/repodata_patching.html. Some Linux distros use incremental patch updates to an entire package including the metadata, or simply rebuild that one package with the same version but fully fixed metadata. However, with PyPI it is not possible to patch metadata at all, only upload new versions, since every uploaded asset is immutable.)

...

But, figuring out that there is a breaking change is still annoying to deal with.

  • Manually setting up a build environment with e.g. setuptools<78, and then passing --no-build-isolation to pip install

The suggestion that build isolation is a problem that needs to be solved by ceasing its use is humorous to me for a variety of reasons.

eli-schwartz avatar Mar 26 '25 01:03 eli-schwartz

Appreciate the detailed response—there’s a lot here I agree with.

To clarify my position again: I’m not saying breaking changes should never happen, or that semver isn’t valid. But when a change breaks large parts of the ecosystem—including the project’s own tests—it’s a clear sign that either (a) the ecosystem wasn’t ready, or (b) the rollout process didn’t sufficiently account for downstream realities.

Yes, the deprecation was years old. But if the warning didn’t reach many users, and the enforcement caused immediate breakage across critical tools, then it’s worth examining whether the communication and deprecation strategy were effective—not just pointing to the time that passed.

I do appreciate that the change was reverted quickly—that’s absolutely the right move. But the speed of the rollback doesn’t erase the impact or the tone that many experienced during the incident. When users are told it’s their fault for not having ideal infra or for depending on old packages, it understandably feels dismissive—especially for smaller projects and teams without the resources of larger orgs.

This isn’t a pitchfork moment. It’s about setting a higher bar for foundational projects. That includes not just bumping major versions, but making sure communication, grace periods, and rollout strategies match the scale of the impact.

I actually agree that the specific change in Setuptools 78 shouldn't have been made - at all, without a compelling case that not breaking it would get in the way of something useful. It's pretty trivial to leave in a few lines of code to normalize case, hyphens vs underscores, etc. Going forward, I'd like to see Setuptools batch together planned breaking changes and announce them. (Not that I'm particularly hopeful that the right people would pay attention to the announcements.) It seems a little ridiculous to me that we're on major version 78 of Setuptools, and trying to make changes like this only to be forced into reverting them is certainly a contributing factor to that.

And that actually highlights what many people are trying to say. The change didn’t have clear benefits, yet it was pushed through despite knowing it would cause downstream disruption. That’s exactly the kind of decision-making we’re asking to be more thoughtful.

If this prompts Setuptools to handle future breaking changes more carefully, and revisit how deprecations are communicated and enforced, that’s a positive outcome. That’s the goal—not perfection, just more care.

And on that note, it’s great to see some traction around @jamesliu4c’s proposal in #4921. That kind of discussion is exactly what we need.

ianballard avatar Mar 26 '25 01:03 ianballard

Just to clarify, I don't argue for forgoing build isolation - but avoiding the installer's automated build isolation may be part of a fix. Hence "manually setting up a build environment". To forgo build isolation entirely would mean just using the environment where Pip was already installed. (You can reinstall Pip in the new build environment; or you can reuse the existing Pip with the --python option.)

zahlman avatar Mar 26 '25 03:03 zahlman

Just to clarify, I don't argue for forgoing build isolation - but avoiding the installer's automated build isolation may be part of a fix. Hence "manually setting up a build environment".

I disagree with this characterization; you're trying to draw a distinction where none exists. :)

If the installer doesn't automatically do it, it's not build isolation. It could be Linux distribution "Reproducible Builds", though.

To forgo build isolation entirely would mean just using the environment where Pip was already installed. (You can reinstall Pip in the new build environment; or you can reuse the existing Pip with the --python option.)

But this is the most interesting part of your statement.

If you reinstalled pip in a new build environment then running pip install doesn't make any sense. And this ties back into your previous claim:

then pass --no-build-isolation to pip install

If you're installing a package in the manually set up environment, it's obviously the environment that you're installing into. If you just wanted to manually set up a throwaway build environment then I daresay you'd want to use python -m build -nwx for that, followed by leaving the build environment and running pip install dist/*.whl.
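Spelled out, that flow looks roughly like this (a sketch assuming a POSIX shell and a buildable project in the current directory; the pin is just an example):

```shell
python -m venv buildenv && . buildenv/bin/activate
pip install build 'setuptools<78'   # pin the backend explicitly in the build env
python -m build -nwx .              # -n: no isolation, -w: wheel only, -x: skip dep check
deactivate                          # leave the build environment...
pip install dist/*.whl              # ...and install the wheel into the real one
```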

Depending on the wheel in question, installing various unneeded copies of it into temporary environments just to inject a pre-installation pip cache entry can be quite... wasteful. And some of those packages have the biggest reason to not have wheels on pypi -- because building large C++ code for every platform in existence doesn't work out well when you use a slightly unusual environment, or even simply upgrade your python version on a different timeframe.

Also, it's brave to assume a cache is necessarily active - for several reasons, including the fact that pip's cache is extremely aggressive about caching the wrong things despite the user passing options which should invalidate it but do not, resulting in incorrect results being installed.

eli-schwartz avatar Mar 26 '25 03:03 eli-schwartz

[Setuptools maintainers: Apologies if this is too far off-topic. This will be my last reply.]

If the installer doesn't automatically do it, it's not build isolation. It could be Linux distribution "Reproducible Builds", though.

I understand "reproducible builds" to refer to a different (and much more strict) idea. Simply manually setting up a build environment is not enough to ensure that you get a byte-for-byte identical wheel.

But if you manually create an environment that is specifically for the purpose of building a specific package, which stands separate from other environments, what could that be called other than "isolation"?

If you reinstalled pip in a new build environment then running pip install doesn't make any sense.

I meant, specifically for the purpose of getting the actual build dependencies installed into that environment.

If you just wanted to manually set up a throwaway build environment then I daresay you'd want to use python -m build -nwx for that, followed by leaving the building environment and running pip install dist/*.whl.

Yes, you're right - not quite sure what I was thinking. Pip's "no build isolation" is fundamentally the same idea as build's - implemented using the same pyproject-hooks - but it also doesn't distinguish the build environment from the install environment. --python doesn't help there, since it can change which install environment is assumed but then it's also still the build environment. Python application developers shouldn't be stuck with no-longer-needed copies of Setuptools in their per-project venvs.

Other options like --target, --prefix, --python-version etc. should theoretically be able to solve the problem, but it turns out they really just don't work properly (e.g. https://github.com/pypa/pip/issues/13296) and would provide a terrible UX for this anyway.

Not that expecting end users to fall back on build is much better - the goal was to automate the whole process after all 😉

But I already know (and have already seen) that the Pip team will resist trying to improve this if it's just to mitigate the risk of future Setuptools breakages - and honestly, I don't blame them. (Greenfield projects like uv - and my own, paper - may feel much less burdened by the idea of offering fine-grained control over build environments to end users.)

zahlman avatar Mar 26 '25 04:03 zahlman

I understand "reproducible builds" to refer to a different (and much more strict) idea. Simply manually setting up a build environment is not enough to ensure that you get a byte-for-byte identical wheel.

But if you manually create an environment that is specifically for the purpose of building a specific package, which stands separate from other environments, what could that be called other than "isolation"?

It's not isolated, not even slightly. :) But reproducible builds do mean, in part, that you install predictable versions of build dependencies. Or as the kids these days call it, lockfiles.

The entire operating system stack is available, including possibly a number of components in a single globally consistent python environment, which is the one you build in.

For a practical example, on Gentoo the package manager is a python package, and depends on a second in-house package that in turn depends on requests. It's impossible to build any package using Gentoo's package management without exposing requests to the build process.

PyPA style build isolation is specifically about building every package isolated from system packages. The original intent, as far as I understand it, was to "catch" people who had buggy setup.py setup_requires that didn't specify everything which was needed, because the developer already had it installed. It's a linter that ran completely out of control.

The other major effect of building outside a virtualenv was that sdists (not wheels) built with setuptools-scm installed accidentally contained all the necessary project files, when individual projects would much rather tell redistributors that the testsuite isn't supported for their use. A quite minor "problem" to found an entire "isolated builds" culture around.

I am to date unaware of anyone who had a broken wheel published as a result of having additional packages installed. Assuming that it could ever be a problem feels silly to me: your dependencies can silently add additional runtime dependencies at any time, and those additional packages will end up in build isolation too. Nope, I'm quite certain it's all about the linting for underspecified build dependencies. Like I said, a linter that ran completely out of control.

What does break wheels all the time is packages that have stale build/ which is persisted across "build isolation", but nobody talks about that. Or if they do talk about it, they say the answer is to only ever build wheels in github actions, which is apparently easier than designing a build system that records and tracks sources, or runs distchecks against an exported git snapshot: https://mesonbuild.com/Creating-releases.html#autotools-dist-vs-meson-dist

tl;dr avoiding build isolation gives you significantly greater control over how your dependencies are built, leading to more reliable builds. And also unlocks your ability to prevent setuptools updates from unpredictably breaking your CI. :)

eli-schwartz avatar Mar 26 '25 06:03 eli-schwartz