pymc-examples Proposal and discussion: Overhaul and maintenance plan for this repository

There was a recent discussion on the PyMC Labs Discord (Oct 31 – Nov 3, 2025) in which several contributors (@benmaier, @fonnesbeck, @OriolAbril, @williambdean, and @ricardoV94) explored the current state of the pymc-examples repository and possible approaches to modernizing it. @OriolAbril and @ricardoV94 rightly pointed out that this discussion should happen with the whole team, not on the Labs Discord.

Current state:

The pixi.toml is outdated and missing dependencies required by several notebooks.
After minimal modernization (pymc3→pymc, theano→pytensor), only ~60 of 136 notebooks ran successfully on the current pymc stack.
Many remaining notebooks use deprecated APIs, outdated sampling patterns, or rely on long-running models.
The examples site does not clearly communicate that some notebooks may target older PyMC versions. A watermark is included at the end of each notebook, but nothing on the site points readers to it or explains what it means.
There is no automated CI/CD process to execute or validate notebooks regularly.
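
As a rough way to quantify the deprecated-API problem listed above, one could scan notebook code cells for legacy imports. This is a minimal sketch, not an existing tool in this repository: the function names (`legacy_imports`, `scan_file`) are hypothetical, and it assumes the standard .ipynb JSON layout with a top-level `cells` list.

```python
import json
import re

# Assumption: pymc3 and theano are the legacy packages worth flagging.
LEGACY_IMPORT = re.compile(r"^\s*(?:import|from)\s+(pymc3|theano)\b", re.MULTILINE)


def legacy_imports(notebook: dict) -> set:
    """Return the legacy packages imported in a notebook's code cells."""
    found = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # ignore markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        found.update(LEGACY_IMPORT.findall(source))
    return found


def scan_file(path: str) -> set:
    """Load an .ipynb file from disk and report its legacy imports."""
    with open(path, encoding="utf-8") as fh:
        return legacy_imports(json.load(fh))
```

A check like this only catches the import-level problems; notebooks that already import `pymc` but call removed functions would still need to be executed to be detected.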

This is problematic not only for users trying to learn pymc for the first time; it also makes things harder for LLM coding agents that find outdated code through web searches.

This issue aims to summarize the discussion that happened so far and open it up to the community. The intent is to propose a coordinated plan to bring the examples closer to the latest PyMC versions while improving transparency for users and maintainers.

Key Points from Previous Discussion

1. General Consensus

There is strong interest in updating and maintaining the notebooks.
Keeping all notebooks runnable with the latest PyMC stack is desirable, though challenging.
Automatic or semi-automatic migration (e.g. via the recently developed pymc-migrate-notebook agent) is promising, but human review remains essential.

2. Challenges

Past migration attempts were slowed by the review process.
Some notebooks (especially GP or ODE examples) are computationally intensive, taking > 30 min to run.
The main branch currently serves as the development branch; CI checks only formatting, not runtime correctness.
pymc-examples notebooks are integrated into the official documentation site, but users aren’t directly informed that some may reference older APIs.

3. Suggestions

Adopt a runtime rule (e.g., notebooks should complete in ≤ 10 min), with exceptions listed in a blacklist.
Use a mocked or partial notebook runner (as in PR [#740](https://github.com/pymc-devs/pymc-examples/pull/740)) to check structure and syntax without full sampling.
Consider automated reruns (e.g. papermill / nbconvert) when dependencies (e.g. pixi.toml) change.
Use a migration agent (e.g. pymc-migrate-notebook) to update failing notebooks, with manual verification before merge.
Communicate clearer version compatibility information to users via the website and README.
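
To make the runtime-rule suggestion concrete, here is a minimal sketch of executing one notebook under a time budget with the `jupyter nbconvert` CLI. The helper names, the `executed` output directory, and the three result labels are assumptions for illustration. Note that `--ExecutePreprocessor.timeout` limits each cell, so the sketch also uses the `timeout` argument of `subprocess.run` to cap the whole run.

```python
import subprocess

BUDGET_SECONDS = 10 * 60  # the 10-minute rule suggested above


def nbconvert_command(notebook_path, budget=BUDGET_SECONDS, output_dir="executed"):
    """Build a `jupyter nbconvert` call that executes one notebook."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        # Per-cell limit; the whole-notebook limit is enforced in run_with_budget.
        f"--ExecutePreprocessor.timeout={int(budget)}",
        "--output-dir", output_dir,
        notebook_path,
    ]


def run_with_budget(notebook_path, budget=BUDGET_SECONDS):
    """Return 'ok', 'failed', or 'timeout' for one notebook."""
    try:
        result = subprocess.run(
            nbconvert_command(notebook_path, budget),
            timeout=budget,  # wall-clock cap for the whole notebook
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"
```

Distinguishing `timeout` from `failed` would let CI report slow notebooks separately from broken ones, which matters for deciding what goes on the exception list.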

Proposed Actions

A. Infrastructure

[ ] Update pixi.toml to include all dependencies required by current notebooks.
[ ] Add CI that programmatically executes (or mock-executes) all notebooks to detect runtime failures. We need to discuss when this should run, e.g. once a week or whenever a new pymc version is released.
[ ] Implement a blacklist.yaml (or similar) for notebooks intentionally excluded from automated runs because they are too computationally expensive.
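
One possible shape for the exclusion list is sketched below, assuming blacklist.yaml is a flat YAML sequence of notebook paths. The file layout, the helper names, and the example paths are all hypothetical. The parser deliberately handles only `- path` lines with optional `#` comments, so that the sketch needs nothing beyond the standard library; a real implementation would more likely use a YAML library.

```python
def parse_flat_list(text):
    """Parse lines of the form '- path  # comment' (flat YAML sequence only)."""
    entries = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("- "):
            entries.add(line[2:].strip().strip("'\""))
    return entries


def notebooks_to_run(all_notebooks, excluded):
    """Return the notebooks CI should execute, in a stable order."""
    return sorted(path for path in all_notebooks if path not in excluded)


# Hypothetical contents of blacklist.yaml
EXAMPLE = """
# notebooks too expensive for automated runs
- examples/gaussian_processes/slow_gp.ipynb  # > 30 min
- "examples/ode_models/slow_ode.ipynb"
"""

excluded = parse_flat_list(EXAMPLE)
selected = notebooks_to_run(
    [
        "examples/ode_models/slow_ode.ipynb",
        "examples/howto/quick.ipynb",
        "examples/gaussian_processes/slow_gp.ipynb",
    ],
    excluded,
)
```

Keeping a short reason next to each entry, as in the comment above, would make it easier to revisit exclusions later.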

B. Migration Process

[ ] Run pymc-migrate-notebook agent across failing notebooks to modernize code.
[ ] @benmaier proposed opening an overhaul branch and first committing (i) the updated pixi.toml and (ii) all notebooks that have been changed with basic replacements (pymc3 -> pymc, theano -> pytensor). Then, individual PRs (one per notebook) would be opened against overhaul. Finally, overhaul would be merged into main. @ricardoV94 noted that this seems unnecessary and that one PR per notebook directly into main should suffice.
[ ] Tag appropriate reviewers depending on the notebook domain, for instance the original author(s), or @OriolAbril, @williambdean, @ChrisFonnesbeck

C. Documentation & Communication

[ ] Add a clear note in the README and/or on the examples site:
- Explaining that some notebooks target specific PyMC versions.
- Indicating that “latest” notebooks are runnable on the current PyMC release.
[ ] Optionally add visual cues (banner, tag, or section) for:
- ✅ Up-to-date notebooks
- ⚠️ Older notebooks (compatible with PyMC < 5)
[ ] Ensure watermarks are visible at the top of the document.

D. Governance / Workflow

[ ] Optionally schedule automated reruns (weekly or on dependency updates).
[ ] Encourage community contributions via issues for each notebook that does not run error-free on the latest pymc version.

Next Steps

The team should discuss this proposal, in particular the following:

Agree on whether an initial overhaul should be done as proposed (or similarly), such that all or nearly all notebooks execute error-free in their entirety on the current stack. This includes updating pixi.toml as the first PR. Also agree on whether updates should land by continuously merging into main or by merging into an overhaul (dev/staging) branch that is merged into main once everything is complete.
Agree on whether a CI/CD pipeline should be created to regularly (when?) execute all notebooks on an updated stack (this includes updating pixi.toml).
Discuss and agree on whether an automated migration agent could be used in this CI/CD pipeline to update notebooks that do not execute successfully.
Define maintainers / reviewers for notebook categories.
Discuss whether users should be more clearly informed about a notebook/example using a potentially outdated API.

Nov 06 '25 10:11 benmaier

I think a giant branch is just less likely to work because of the last-20% effect. Getting 80% in gradually is already a huge improvement for everyone, so I don't see why we should delay it.
Maybe whenever an intermediate pymc version is released (vx.y.z -> vx.(y+1).z); that's when API breakage is bound to happen.
Not sure what that means. It could open a PR for review?
Gonna go stale?
That sounds like a really great improvement.
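
As a rough illustration of the vx.y.z -> vx.(y+1).z trigger mentioned above, a CI job could compare the previously tested version with the newly released one. This is only a sketch: the function names are hypothetical, and it handles plain `x.y.z` strings only (pre-release suffixes such as `rc1` would need a proper version parser).

```python
def parse_version(version):
    """Turn 'v5.26.1' or '5.26.1' into (5, 26, 1)."""
    major, minor, patch = version.lstrip("v").split(".")[:3]
    return int(major), int(minor), int(patch)


def is_minor_or_major_bump(old, new):
    """True when the (major, minor) pair increased, i.e. not a patch-only release."""
    return parse_version(new)[:2] > parse_version(old)[:2]
```

With this, a patch release such as `5.25.0` to `5.25.1` would not trigger a full rerun, while `5.25.1` to `5.26.0` would. The version numbers here are illustrative only.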

Nov 08 '25 08:11 ricardoV94

I basically agree with @ricardoV94.

For point 1, one of the main goals of having different versioning schemes between the library docs and the example notebooks is making sure fixes and improvements to the notebooks (which happen with the latest release) appear as quickly as possible on the website instead of needing a pymc release for them to be deployed. Unless there are strong reasons against that for this specific case, we should maintain the workflow of smallish PRs into main. To emphasize @ricardoV94's point about progress stalling:

> There is strong interest in updating and maintaining the notebooks.

> Past migration attempts were slowed by the review process.

Combining these two points, I think the strong interest has always been there, but interest doesn't mean availability and capacity. We have gone through multiple rounds of updating the notebooks in bulk. But all of these happened essentially on a volunteer basis, which made it challenging to carry out the updates alone. Combined with the fact that both the pymc library and best practices in Bayesian workflow are actively evolving, we never managed to catch up, and after a while the enthusiasm fizzled out.

For point 2, I would probably say we should run this whenever a new PyMC release is out, regardless of the type of release. I am not sure it would be a good use of our time to run all the notebooks against a PyMC version installed from GitHub, and running multiple times between PyMC releases would give the same result the vast majority of the time.

For point 4, we have/had this list already; I remember filling in the types of notebooks I was happy to review. I can only assume it is now outdated, as well as not very discoverable.

I am particularly happy to help with point 5. We haven't been enforcing the date updates in the "blogpost metadata", but if it is easy to ensure PRs keep that up to date, we could use it to show the last time the notebook was executed in the right sidebar, alongside its tags. The template that defines the portion at the top of the right sidebar with the notebook tags already accesses information from the post directive, so getting the date too should be quite straightforward.

Regarding enforcement, given we do have a pre-commit check to ensure the watermark cell is there, maybe we could add a second check to ensure both dates match? Even have pre-commit modify the post directive metadata with the date from the watermark cell if we can achieve that robustly.
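
A date-matching check along these lines could be sketched as follows. This is an illustration under assumptions, not the existing pre-commit hook: it assumes the post directive appears in a markdown cell as `:::{post} Jan 12, 2022`, that the watermark cell output contains a line like `Last updated: Wed Jan 12 2022`, and all function names are hypothetical.

```python
import re
from datetime import date, datetime
from typing import Optional

POST_RE = re.compile(r":::\{post\}\s*(.+)")
WATERMARK_RE = re.compile(r"Last updated:\s*(\w{3} \w{3} \d{1,2} \d{4})")


def _text(value):
    """Notebook fields may be a string or a list of lines."""
    return "".join(value) if isinstance(value, list) else value


def _parse(raw, formats) -> Optional[date]:
    for fmt in formats:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue  # try the next assumed format
    return None


def post_date(nb) -> Optional[date]:
    """Date given in the post directive, if one is found."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "markdown":
            match = POST_RE.search(_text(cell.get("source", "")))
            if match:
                return _parse(match.group(1), ("%b %d, %Y", "%B %d, %Y"))
    return None


def watermark_date(nb) -> Optional[date]:
    """Date printed by the watermark cell, if one is found."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            match = WATERMARK_RE.search(_text(output.get("text", "")))
            if match:
                return _parse(match.group(1), ("%a %b %d %Y",))
    return None


def dates_match(nb) -> bool:
    """True only when both dates are present and equal."""
    posted, executed = post_date(nb), watermark_date(nb)
    return posted is not None and posted == executed
```

A hook built on this could either fail when `dates_match` is false or rewrite the directive from the watermark date; the second option is only robust if the date formats are as uniform as assumed here.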

Another idea could be adding a link, also in the right sidebar, with text along the lines of "I tried running the notebook and it failed". We could get rid of the "edit on GitHub" and "show source" links because they aren't very useful for notebooks. That page could have an explanation/FAQ on things that tend to go wrong and recommendations on what to do: up/downgrade the env to match the one in pixi.toml/lock, check for existing issues and open one if necessary, check the pymc release notes...

> Adopt a runtime rule (e.g., notebooks should complete in ≤ 10 min), with exceptions listed in a blacklist.

I lean towards arguing against this, but I think it will greatly depend on the implementation details. I think it is still very valuable to have real world examples in the gallery, even if we can't afford to keep them up to date, and I think measures like this would hint towards the opposite idea. But we could also decide that such notebooks should live somewhere else like the pymc blog instead: https://www.pymc.io/blog.html

Nov 08 '25 16:11 OriolAbril

We could run the pymc-migrate-notebook agent as a GitHub Action and have it incorporate feedback from GitHub directly. We've set up something similar elsewhere.

Nov 09 '25 08:11 twiecki