[RFC] Improve sympy startup times by pre-generating the assumption rules
During sympy startup the assumption rules are generated (see the line _assume_rules = FactRules(...) in sympy.core.assumptions). This takes about 30 ms. We can eliminate these 30 ms by generating the data once and then loading the result (loading time is < 1 ms). A prototype for this is in https://github.com/sympy/sympy/compare/master...eendebakpt:performance/assumptions_startup
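To make the idea concrete, here is a minimal sketch of the generate-once, load-later pattern. It is not the code from the prototype branch: build_rules, render_source and load_rules are hypothetical stand-ins, and the tiny implication table is illustrative rather than sympy's real rule set.

```python
import ast
import pprint


def build_rules():
    """Hypothetical stand-in for the expensive derivation.

    Computes the transitive closure of a few toy implications.
    """
    implications = {
        "integer": {"rational"},
        "rational": {"real"},
        "real": {"complex"},
        "complex": set(),
    }
    closure = {}
    for fact in implications:
        seen = set()
        stack = list(implications[fact])
        while stack:
            item = stack.pop()
            if item not in seen:
                seen.add(item)
                stack.extend(implications[item])
        # Sort so the rendered output is deterministic.
        closure[fact] = sorted(seen)
    return closure


def render_source(rules):
    """Render the derived data as a plain Python literal."""
    return pprint.pformat(rules, width=79)


def load_rules(source):
    """Load the data back without repeating the derivation."""
    return ast.literal_eval(source)


rules = build_rules()          # done once, by a generator script
source = render_source(rules)  # this text would be committed
loaded = load_rules(source)    # this is all that happens at startup
```

If the rendered literal is stored in a regular module, Python also caches the compiled bytecode, so a later import does not need to re-parse the source.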
Is the 30 ms gain in startup time worth the additional complexity?
If so, then we need to think about:
- How to store the pre-generated data (currently in plain Python, perhaps there is a better format)
- How to make sure that we are automatically notified if the assumptions change. We could add a unit test that compares the pre-generated data to a fresh generation.
An alternative is to make the assumption generation in FactRules faster. From some quick profiling I estimate that this is possible, but I do not think it can be as fast as loading the pre-generated data.
I think this is a good idea. This is already done for the new assumptions (see bin/ask_update.py and sympy/assumptions/ask_generated.py).
> Is the 30 ms gain in startup time worth the additional complexity?
Yes, I think so. It's very rare to change the rules that define these although it does sometimes happen (e.g. #23739).
> - How to store the pre-generated data (currently in plain Python, perhaps there is a better format)
I think that plain Python is fine. It just needs to be clearly annotated as code that should not be directly edited. It should also match style guidelines and flake8 etc (e.g. not have overly long lines).
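As a rough illustration of what such a generated file could look like, the sketch below renders data into module source with a do-not-edit header and wraps it to 79 columns. The names HEADER, render_module, GENERATED_RULES and update_rules.py are hypothetical, and the sample data is illustrative, not sympy's actual rule table.

```python
import pprint
import textwrap

# Header placed at the top of the generated module. The script name
# mentioned in it is a hypothetical placeholder.
HEADER = '''"""
Do NOT edit this file by hand.

It is generated by a script (hypothetical name: update_rules.py).
Rerun that script whenever the rule definitions change.
"""
'''


def render_module(data, width=79):
    """Return module source that assigns data to GENERATED_RULES."""
    indent = "    "
    # Leave room for the indentation so no line exceeds the width.
    body = pprint.pformat(data, width=width - len(indent))
    body = textwrap.indent(body, indent)
    return f"{HEADER}\nGENERATED_RULES = (\n{body}\n)\n"


# Illustrative data only.
data = {
    "integer": ["algebraic", "commutative", "complex", "rational", "real"],
    "rational": ["algebraic", "commutative", "complex", "real"],
    "real": ["commutative", "complex"],
}

module_source = render_module(data)
print(module_source)
```

Keeping the output sorted and deterministic also means that regenerating the file produces a small, reviewable diff when the rules do change.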
> - We could add a unit test that compares the pre-generated data to a fresh generation.
Yes, that's a good idea. That should be checked. It's unusual to change these rules so a test that they are up to date is needed to remind anyone who does try to change them.
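Such a test could be as small as the sketch below. Both generate_rules and COMMITTED_RULES are hypothetical stand-ins: in practice the first would rerun the real derivation and the second would be imported from the generated module.

```python
def generate_rules():
    """Hypothetical stand-in for regenerating the rules from scratch."""
    return {
        "integer": ["rational", "real"],
        "rational": ["real"],
    }


# Hypothetical stand-in for the data stored in the generated module.
COMMITTED_RULES = {
    "integer": ["rational", "real"],
    "rational": ["real"],
}


def test_generated_rules_up_to_date():
    # Fails as soon as the definitions and the committed data diverge,
    # which reminds whoever changed the rules to rerun the generator.
    assert COMMITTED_RULES == generate_rules(), (
        "Pre-generated rules are stale; rerun the generator script."
    )
```

The fresh generation only runs in the test suite, so its cost no longer affects import time.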
> An alternative is to make the assumption generation FactRules faster. From some quick profiling I estimate that it is possible
That would be good in any case but obviously the value of speeding it up almost vanishes if it's only needed when there is an update to the inference rules rather than happening at import.