Checking API, ABI & bytecode stability in pre-merge CI
Spotting ABI compatibility breaks, bytecode compatibility breaks, and public API additions in maintenance patches still relies on human code review, and it's unfortunately pretty easy to miss the negative implications of seemingly innocent low-level changes.
However, there are tools available that should allow us to automate at least some of those checks in Travis CI and AppVeyor.
This is a tracking issue to start collecting some of the items we'd need in order to be able to do effective checks for all of these items for the maintenance branches, and for the stable ABI on the development branch.
- [x] Bytecode magic number stability checking: https://bugs.python.org/issue29514
- [ ] Document the compatibility expectations somewhere in the developer guide (or in PEP 387?)
- [ ] Export public C API symbols for given X.Y.0 release (autotools build toolchain)
- [ ] Export public C API symbols for given X.Y.0 release (MSVC build toolchain)
- [ ] Check currently exported symbol list against reference list for maintenance branches
- [ ] Export public C API symbols for given Py_LIMITED_API setting (autotools build toolchain)
- [ ] Export public C API symbols for given Py_LIMITED_API setting (MSVC build toolchain)
- [ ] Check Py_LIMITED_API exported symbol lists against reference lists for all branches
- [ ] Add release process step to export the symbol lists when entering X.Y.0 release candidate phase
- [ ] As above, but using libabigail to look for ABI incompatibilities (e.g. struct size changes, function signature changes)
@encukou @zooba @brettcannon @ambv We'd been talking about automating some of our "maintenance release stability guarantees" for a while, so I figured it made sense to start a checklist of pre-requisites for doing that.
Also adding in the current release managers: @benjaminp @ned-deily @larryhastings (although I expect the bytecode check is the only one we'll backport to 3.5, since that branch is so close to entering security-fix only mode)
@ned-deily pointed out that a good starting point would be to clearly document the compatibility expectations somewhere in the Developer Guide.
And/or possibly a PEP?
@ned-deily That reminded me that @benjaminp's attempt at defining an explicit backwards compatibility policy is still open for review: https://www.python.org/dev/peps/pep-0387/
We nearly got the public symbols part scripted on bpo somewhere. For that, I think we want:
- write public API symbols to a text file that is checked in (so you see in PR that you changed it)
- RM copies text file to a new location for each release
- some sort of in-tree unit test to validate no breaking differences since X.Y.0?
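As a rough sketch of the checked-in text file plus in-tree test idea above: this assumes one symbol name per line in both the reference and the current list. The file format, the helper names, and the `PyFoo_*` symbols are all hypothetical, not an existing CPython tool.

```python
def load_symbols(text):
    """Parse a symbol list: one name per line; blank lines and '#' comments are ignored."""
    symbols = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            symbols.add(line)
    return symbols


def compare_symbols(reference, current):
    """Return (removed, added) as sorted lists of symbol names.

    'removed' holds names from the X.Y.0 reference list that are no longer
    exported (a breaking change); 'added' holds new names that would need
    explicit review on a maintenance branch.
    """
    removed = sorted(reference - current)
    added = sorted(current - reference)
    return removed, added


if __name__ == "__main__":
    # Hypothetical symbol names, for illustration only.
    reference = load_symbols("# X.Y.0 reference\nPyFoo_New\nPyFoo_Old\n")
    current = load_symbols("PyFoo_New\nPyFoo_Extra\n")
    removed, added = compare_symbols(reference, current)
    print("removed:", removed)
    print("added:", added)
```

Reporting removals and additions separately would let a policy treat them differently, e.g. fail on any removal but only warn on additions.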
Not sure how that turns into policy though. We need to allow public API changes, even between releases. Perhaps we actually want the tool first here so that we can set policy in terms of "permissible changes"?
@zooba I thought I recalled some previous work on public symbol lists, but my search-fu failed me. If you can figure out where it is, then I'll add the link to the original post.
The bytecode stability check has now been backported to all branches still in maintenance (I was initially puzzled as to why the 3.6 backport passed without setting a branch specific magic number, but then I realised we simply haven't needed any magic number changes in 3.7 yet - no new optimisation opcodes, and PEP 538 was the first PEP implemented for it).
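For readers unfamiliar with the check, its general shape is roughly as follows. This is a minimal sketch, not the test that was actually merged: the expected value is supplied by the caller, and `raw_magic_number` and `check_magic_number` are hypothetical helper names. It relies only on `importlib.util.MAGIC_NUMBER`, whose first two bytes hold the little-endian magic number, followed by `\r\n`.

```python
import importlib.util


def raw_magic_number(magic=importlib.util.MAGIC_NUMBER):
    """Decode the 16-bit magic number from the 4-byte pyc header prefix."""
    if len(magic) != 4 or magic[2:] != b"\r\n":
        raise ValueError(f"unexpected magic number layout: {magic!r}")
    return int.from_bytes(magic[:2], "little")


def check_magic_number(expected, magic=importlib.util.MAGIC_NUMBER):
    """Return None if the magic number matches 'expected', else an error message."""
    actual = raw_magic_number(magic)
    if actual != expected:
        return (
            f"magic number changed from {expected} to {actual}; "
            "this would invalidate existing .pyc files on a maintenance branch"
        )
    return None
```

A per-branch test would then pin `expected` to the value shipped in the X.Y.0 release and fail whenever `check_magic_number` returns a message.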
What kind of script needs to be used (a comparer or a generator)? Can the public symbols be generated with `nm ./python | sort | cut -d ' ' -f3`?
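One caveat with a plain `nm` pipeline is that it also lists local and undefined symbols. Below is a hedged sketch of the generator half, filtering `nm`'s default (BSD-style) text output down to globally defined names. The `Py`/`_Py` prefix filter and the function name are assumptions for illustration, and the sample addresses and symbols are made up.

```python
def exported_symbols(nm_output, prefixes=("Py", "_Py")):
    """Extract globally defined symbol names from default nm output.

    Each defined symbol appears as '<address> <type> <name>'; undefined
    symbols have no address, so they split into only two fields.
    """
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # undefined symbols ('U name') and anything unexpected
        _address, kind, name = fields
        # Uppercase type letters mark global symbols; lowercase ones are local.
        if kind.isupper() and kind != "U" and name.startswith(prefixes):
            names.add(name)
    return sorted(names)


if __name__ == "__main__":
    sample = (
        "0000000000001000 T PyFoo_New\n"
        "0000000000002000 t local_helper\n"
        "                 U malloc\n"
        "0000000000003000 D _PyFoo_State\n"
        "0000000000004000 T main\n"
    )
    print("\n".join(exported_symbols(sample)))
```

Writing the sorted names to a text file gives the generator; a comparer can then diff that file against the checked-in reference.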
See also the discussion in bpo-23903.
> I was initially puzzled as to why the 3.6 backport passed without setting a branch specific magic number, but then I realised we simply haven't needed any magic number changes in 3.7 yet - no new optimisation opcodes
There are the new LOAD_METHOD and CALL_METHOD opcodes. The test is skipped in 3.7 because it has not been released yet.
Thanks @serhiy-storchaka, that's the one I was thinking of.
Essentially, we can use the preprocessor to filter our header files, and then some form of code formatter/minimizer to avoid whitespace changes, and we should get stable and comparable output files containing the precise C API.
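A minimal sketch of the normalisation step, assuming the headers have already been run through the preprocessor (for example with `gcc -E`), which leaves linemarker lines starting with `#` in the output. The function name is hypothetical, and collapsing whitespace like this is only a crude stand-in for a real formatter/minimizer: it does not normalise spacing around punctuation, and it would alter whitespace inside string literals.

```python
import re


def normalize_preprocessed(text):
    """Make preprocessor output comparable across whitespace-only changes."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and linemarkers/directives such as '# 1 "example.h"'.
        if not stripped or stripped.startswith("#"):
            continue
        kept.append(stripped)
    # Collapse every remaining run of whitespace to a single space.
    return re.sub(r"\s+", " ", " ".join(kept))


if __name__ == "__main__":
    before = '# 1 "example.h"\nint   PyFoo_New(void);\n\n'
    after = '# 7 "example.h"\nint PyFoo_New(void);\n'
    print(normalize_preprocessed(before) == normalize_preprocessed(after))
```

Two header revisions that differ only in layout or line numbering then normalise to the same string, so any remaining difference points at an actual API change.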