Integration feedback report: psf/black
Hello! So I'm Richard from the psf/black maintainer team, and for the past few months I've been working on finishing up the work done by Sully in https://github.com/psf/black/pull/1009 so mypyc can finally be deployed. As a token of appreciation (and to help out the project) I've written up a report containing my experience integrating mypyc with psf/black :)
First of all, here are the relevant links:
- Development branch: https://github.com/psf/black/compare/main...mypyc-support-pt2 (contains crash workarounds, optimizations, and QOL improvements) - https://github.com/psf/black/pull/2431
- Wheel building workflow + other infra + project management homebase: https://github.com/ichard26/black-mypyc-wheels
- Relevant issue on the psf/black repository: https://github.com/psf/black/issues/366
- Performance report: https://gist.github.com/ichard26/b996ccf410422b44fcd80fb158e05b0d
The good
- The performance wins are excellent: ~2x average across multiple workloads!
- As long as the hot code wasn't too spread out, optimizing for mypyc wasn't that hard (tightening up types, adding lots of `Final`, etc.) - additional performance wins of 5-15% were achieved :D
- There weren't actually that many crashes or other incompatibilities (ignoring dataclasses) and they all had easy enough workarounds
- Wheel building was pretty easy and straightforward (once some changes were made to the `setup.py` logic) - I did steal lots of ideas from mypyc/mypy_mypyc-wheels :P
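To give a rough idea of what "tightening up types, adding lots of `Final`" looks like, here is a minimal sketch. The constants and the helper are hypothetical and not taken from Black's source. The assumption is that precise annotations and `Final` module-level constants give mypyc more to work with than untyped or `Any`-typed code.

```python
from typing import Final, List

# Hypothetical constants, not taken from Black's source. Declaring them
# Final tells the type checker (and mypyc) that they are never rebound.
INDENT: Final = "    "
MAX_WIDTH: Final = 88


def indent_lines(lines: List[str], depth: int) -> List[str]:
    """Prefix each non-empty line with `depth` levels of indentation."""
    # Concrete types (List[str], int) instead of Any or missing annotations.
    prefix = INDENT * depth
    return [prefix + line if line else line for line in lines]


def fits(line: str) -> bool:
    """Return True if `line` is no longer than MAX_WIDTH characters."""
    return len(line) <= MAX_WIDTH
```

The code behaves identically when run interpreted, so this kind of change can be made incrementally and measured one hot path at a time.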
The bad
- Dataclasses were responsible for the majority of crashes and other compatibility issues. The cases that led to issues include:
  - a dataclass that is also subclassing `ast.NodeVisitor` crashes with `AttributeError: attribute '__dict__' of 'type' objects is not writable` (maybe not fixable?)
  - a dataclass inheriting from Generic (already reported in https://github.com/mypyc/mypyc/issues/827)
  - `abc.ABC` + dataclass crashes too
- Unanalyzed or unreachable code continues to be problematic - a "fun" case involved this:

  > if sys.platform == "win32":
  > RuntimeError: Reached allegedly unreachable code!

  This was caused by `mypy.ini` hard-coding `platform=linux`. Another interesting case involved a module typed as `Any`, which made an else branch appear as if it's never taken. The resulting unanalyzed code crashed when hit. More details in this commit: https://github.com/psf/black/commit/acb77f7951229b2670edc3fad366ec256fa0c86f
And other various issues already reported on the mypyc issue tracker:
- https://github.com/mypyc/mypyc/issues/885
- https://github.com/mypyc/mypyc/issues/884
- https://github.com/mypyc/mypyc/issues/864
- https://github.com/mypyc/mypyc/issues/862
- https://github.com/mypyc/mypyc/issues/819 bit us too :(
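For anyone who hasn't hit the `sys.platform` case described above, here is a minimal sketch of the pattern. The helper is hypothetical and not Black's actual code. The assumption is that mypy evaluates direct `sys.platform` comparisons against its configured `platform`, so with `platform=linux` the Windows branch is treated as unreachable and never analyzed.

```python
import sys


def uses_windows_paths() -> bool:
    """Return True when running on Windows (hypothetical helper)."""
    if sys.platform == "win32":
        # With `platform=linux` hard-coded in mypy.ini, mypy considers this
        # branch unreachable. The compiled module then fails here with
        # "RuntimeError: Reached allegedly unreachable code!" as soon as a
        # Windows user actually takes the branch.
        return True
    return False
```

Interpreted, this function is perfectly fine on every platform. The problem only shows up in the compiled build, which is why a hard-coded `platform` setting is worth double-checking before compiling.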
Suggestions
- Add a note about `platform=$OS` being problematic
- Add a note about docstrings being stripped - the click CLI toolkit uses docstrings to get command descriptions ... which ended up breaking without a clear reason why (until I stumbled on some issue on the mypyc issue tracker that explained why :p)
- Prioritize https://github.com/mypyc/mypyc/issues/849 - while I'm OK with wheel sizes going up after adding mypyc, the ~25-35x increase (120KB -> ~3.3MB) I observed wasn't the easiest thing to swallow. Adding `CFLAGS=-g0` brought the wheels to a much more reasonable ~1MB each.
- Prioritize https://github.com/python/mypy/pull/10395 - quick iteration and experimentation was slowed down by repeated cleaning up + `python -c "import foo"`
- Perhaps add a way to skip invoking the C extension build - this would make analyzing differences at the IR or C level faster ... although I would understand if this is too low-level and shouldn't be made easier
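To illustrate the docstring point, here is a small stdlib-only sketch of the pattern that breaks. `command_description` and `main` are hypothetical stand-ins, not click's internals. The idea is simply that anything falling back to `__doc__` loses its text once docstrings are stripped, while an explicitly passed string survives (click's `command()` decorator accepts a `help=` argument for this).

```python
from typing import Callable, Optional


def command_description(
    func: Callable[..., object], explicit_help: Optional[str] = None
) -> str:
    """Return help text, preferring an explicit string over the docstring."""
    if explicit_help is not None:
        return explicit_help
    # Falls back to the docstring, which may be missing in a compiled module
    # even though the Python source had one.
    doc = getattr(func, "__doc__", None)
    return (doc or "").strip()


def main() -> None:  # no docstring: stands in for a compiled function
    pass


print(repr(command_description(main)))
print(repr(command_description(main, "Format the given files.")))
```

The first call returns an empty string, which is roughly what the broken command descriptions looked like. The second shows the explicit string being used regardless of what happened to the docstring.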
Questions
- Can anything be done about analysis + Python-to-C transcompilation being done twice during any modern wheel build? Unfortunately any evaluation of `setup.py` ends up triggering the above, and it turns out both `pip wheel` and `python -m build` call at least two PEP 517 hooks that eventually end up evaluating `setup.py`. It's not a big deal, but it slows down wheel builds a little bit. Sounds hard to avoid tho (caching would be complicated).
- What else can we do to help mypyc mature and overall get better?
What's next?
Currently the stability and performance data for Linux were just collected and look good so a PR was opened against psf/black. There's a lot of TODOs listed in that PR but it is progress.
The main thing I need to do is get the wheels significantly more battle tested so they can eventually find their way to PyPI. First of all I need to run a mypy-primer equivalent on Windows and macOS. Assuming that doesn't turn up anything scary, the PR will probably get merged once the reviews are over with. Then at that point, it's off to community testing where we'll be asking members of the community to use the experimental wheels and report issues. So expect updates on this issue as more stability data comes in ... but I'm sure this is already enough insights for a post ^^
Finally
As much as this is riddled with negativity and bugs, I appreciate all of the work y'all put into mypyc. It's awesome to see so much work going into making Python code faster lately! :black_heart: If you have any questions, feel free to ask away.
It's been a long while since I've updated this post. I'll try to write an update soon but the TL;DR is that we are shipping mypyc compiled wheels in Release 22.1.0 of Black. Other than https://github.com/psf/black/issues/2845 and unexpected unofficial API breakage it's been smooth sailing so far :)
I'm actually in the midst of writing a blog post describing my experience integrating mypyc into Black so that's currently eating my OSS time :slightly_smiling_face: If any mypyc core developers want to review it once I finish it (no promises on when!) feel free to @ me and we can discuss. Obviously please don't feel like you must, it's just an invitation, I get we're all busy as OSS maintainers :slightly_smiling_face:
Once again, thank you for the great product!
So I'm going to assume that people have heard of my blog series since it kinda blew up :sweat_smile: but just in case, here's part one.
Anyway, there's not actually too much to comment on even after five months. With the help of Jelle, I was able to land fixes for https://github.com/mypyc/mypyc/issues/917 to allow us to use the latest mypy[c] to compile Black. Turns out that's not possible quite yet thanks to https://github.com/mypyc/mypyc/issues/941. https://github.com/psf/black/issues/2845 is still a pain and not much progress has been made on it, but there don't seem to be any other crashes or issues using mypyc with Black as of writing :tada:
For what it's worth, I've been learning C this summer and slowly familiarizing myself with the mypyc codebase so I can contribute fixes and other improvements. My hope is to be able to eventually fix the issues impacting Black myself, ... but some of these issues do look very involved so I'm not too sure about that :)