sympy
sympy copied to clipboard
Improve sympy startup time by pre-generating assumption rules
References to other Issues or PRs
Fixes #23848
Brief description of what is fixed or changed
The assumption rules in sympy.core.assumptions have been pre-generated and stored. This improves the import time from
Note: the first run (warm up) was not included in the average + std dev
All runs (including warm up):
[0.2799325890000546, 0.2792433899999196, 0.2808860589998403, 0.27852462000009837, 0.2772021640003004, 0.27793970099992293, 0.27890890300022875, 0.27670108499978596, 0.2830468789998122, 0.2824971650002226, 0.27993182499994873, 0.27861428000005617, 0.2989063120003266, 0.2775266109997574, 0.29364538099980564, 0.2771660369999154, 0.286309306000021, 0.2790996429998813, 0.2776118500000848, 0.2799282989999483, 0.28268679099983274, 0.286931328999799, 0.27870470200014097, 0.2795189109997409, 0.28090279800017015, 0.2771274270003232, 0.2784352640001089, 0.2790888449999329, 0.28215103899992755, 0.28110537000020486, 0.2792655839998588, 0.28015478599991184, 0.2742751440000575, 0.277146034999987, 0.2830514439997387, 0.30372588199998063, 0.278697158000341, 0.2799442360001194, 0.2820860720003111, 0.28028372000017043, 0.27603550999992876, 0.27661651399967013, 0.27680658300005234, 0.27522678399964207, 0.2784867990003477, 0.2751314780002758, 0.2808772350003892, 0.2758764670002165, 0.2807688479997523, 0.2850936009999714, 0.2838939059997756]
Number of tests: 50
The speed of "import sympy" is: 0.280676 +- 0.005417
to
Note: the first run (warm up) was not included in the average + std dev
All runs (including warm up):
[0.26176198099983594, 0.2641488809999828, 0.25052918699975635, 0.2563463709998359, 0.25801725200017245, 0.25252887999977247, 0.24990126400007284, 0.249705110999912, 0.2548714300000938, 0.25649742299992795, 0.24887773199998264, 0.2522605530002693, 0.2585813169998801, 0.2490264909997677, 0.2559804580000673, 0.25061231300014697, 0.2539327670001512, 0.25147581700002775, 0.24917899499996565, 0.25249531100007516, 0.25062299900037033, 0.25119203399981416, 0.25984018200006176, 0.25059449900027175, 0.2599466599999687, 0.24821165699995618, 0.25074421799990887, 0.2574191149997205, 0.2514301539999906, 0.2553798280000592, 0.2530995380002423, 0.24977733599962448, 0.24670270900014657, 0.2564127300001928, 0.26333457200007615, 0.24589979699976539, 0.2507734949999758, 0.2475807599998916, 0.2506884849999551, 0.24823069700005362, 0.2492625009999756, 0.25011981900024693, 0.2483615269998154, 0.25432030199999645, 0.24687275099995531, 0.2516041240000959, 0.255909456000154, 0.25308567299998685, 0.24905589000036343, 0.24761925300026633, 0.25222413200026494]
Number of tests: 50
The speed of "import sympy" is: 0.252426 +- 0.004169
Other comments
A test has been added to compare the pre-generated assumptions with freshly generated assumptions.
This assumes the assumptions are system independent, this has to be checked.
Release Notes
- core
- Import startup time by using pre-generated assumptions.
:white_check_mark:
Hi, I am the SymPy bot (v167). I'm here to help you write a release notes entry. Please read the guide on how to write release notes.
Your release notes are in good order.
Here is what the release notes will look like:
- core
- Import startup time by using pre-generated assumptions. (#23851 by @eendebakpt)
This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.12.
Click here to see the pull request description that was parsed.
<!-- Your title above should be a short description of what
was changed. Do not include the issue number in the title. -->
#### References to other Issues or PRs
<!-- If this pull request fixes an issue, write "Fixes #NNNN" in that exact
format, e.g. "Fixes #1234" (see
https://tinyurl.com/auto-closing for more information). Also, please
write a comment on that issue linking back to this pull request once it is
open. -->
Fixes #23848
#### Brief description of what is fixed or changed
The assumption rules in `sympy.core.assumptions` have been pre-generated and stored. This improves the import time from
```
Note: the first run (warm up) was not included in the average + std dev
All runs (including warm up):
[0.2799325890000546, 0.2792433899999196, 0.2808860589998403, 0.27852462000009837, 0.2772021640003004, 0.27793970099992293, 0.27890890300022875, 0.27670108499978596, 0.2830468789998122, 0.2824971650002226, 0.27993182499994873, 0.27861428000005617, 0.2989063120003266, 0.2775266109997574, 0.29364538099980564, 0.2771660369999154, 0.286309306000021, 0.2790996429998813, 0.2776118500000848, 0.2799282989999483, 0.28268679099983274, 0.286931328999799, 0.27870470200014097, 0.2795189109997409, 0.28090279800017015, 0.2771274270003232, 0.2784352640001089, 0.2790888449999329, 0.28215103899992755, 0.28110537000020486, 0.2792655839998588, 0.28015478599991184, 0.2742751440000575, 0.277146034999987, 0.2830514439997387, 0.30372588199998063, 0.278697158000341, 0.2799442360001194, 0.2820860720003111, 0.28028372000017043, 0.27603550999992876, 0.27661651399967013, 0.27680658300005234, 0.27522678399964207, 0.2784867990003477, 0.2751314780002758, 0.2808772350003892, 0.2758764670002165, 0.2807688479997523, 0.2850936009999714, 0.2838939059997756]
Number of tests: 50
The speed of "import sympy" is: 0.280676 +- 0.005417
```
to
```
Note: the first run (warm up) was not included in the average + std dev
All runs (including warm up):
[0.26176198099983594, 0.2641488809999828, 0.25052918699975635, 0.2563463709998359, 0.25801725200017245, 0.25252887999977247, 0.24990126400007284, 0.249705110999912, 0.2548714300000938, 0.25649742299992795, 0.24887773199998264, 0.2522605530002693, 0.2585813169998801, 0.2490264909997677, 0.2559804580000673, 0.25061231300014697, 0.2539327670001512, 0.25147581700002775, 0.24917899499996565, 0.25249531100007516, 0.25062299900037033, 0.25119203399981416, 0.25984018200006176, 0.25059449900027175, 0.2599466599999687, 0.24821165699995618, 0.25074421799990887, 0.2574191149997205, 0.2514301539999906, 0.2553798280000592, 0.2530995380002423, 0.24977733599962448, 0.24670270900014657, 0.2564127300001928, 0.26333457200007615, 0.24589979699976539, 0.2507734949999758, 0.2475807599998916, 0.2506884849999551, 0.24823069700005362, 0.2492625009999756, 0.25011981900024693, 0.2483615269998154, 0.25432030199999645, 0.24687275099995531, 0.2516041240000959, 0.255909456000154, 0.25308567299998685, 0.24905589000036343, 0.24761925300026633, 0.25222413200026494]
Number of tests: 50
The speed of "import sympy" is: 0.252426 +- 0.004169
```
#### Other comments
A test has been added to compare the pre-generated assumptions with freshly generated assumptions.
This assumes the assumptions are **system independent**, this has to be checked.
#### Release Notes
<!-- Write the release notes for this release below between the BEGIN and END
statements. The basic format is a bulleted list with the name of the subpackage
and the release note for this PR. For example:
* solvers
* Added a new solver for logarithmic equations.
* functions
* Fixed a bug with log of integers.
or if no release note(s) should be included use:
NO ENTRY
See https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more
information on how to write release notes. The bot will check your release
notes automatically to see if they are formatted correctly. -->
<!-- BEGIN RELEASE NOTES -->
* core
* Import startup time by using pre-generated assumptions.
<!-- END RELEASE NOTES -->
Update
The release notes on the wiki have been updated.
Maybe we can merge this with similar code for the new assumptions (see bin/ask_generated.py).
🟠
Hi, I am the SymPy bot (v167). I've noticed that some of your commits add or delete files. Since this is sometimes done unintentionally, I wanted to alert you about it.
This is an experimental feature of SymPy Bot. If you have any feedback on it, please comment at https://github.com/sympy/sympy-bot/issues/75.
The following commits add new files:
- d2299c490fe484b83ebd75731996c784c670f816:
sympy/core/assumptions_generated.py
If these files were added/deleted on purpose, you can ignore this message.
@asmeurer I merged the generation with bin/ask_update.py.
It turns out that the generated rules are not system independent. The logic is the same, but the ordering of the rules is different. This makes a check on whether the pre-generated rules are equal to generated rules complex. (in particular the _assume_rules.beta_triggers will be hard to compare).
I tracked down the reason for this, it turns out that Logic.fromstring('complex & !algebraic') can result in And(complex, Not(algebraic)) or in And(Not(algebraic), complex), depending on the system. The difference can be traced back to args = sorted(set(cls.flatten(bargs)), key=hash) in the class AndOr_Base (I guess the hash of the And is system dependent).
I added some sorting to the Prover to resolve this, but I am not sure this is the best approach.
Sorting sounds good. I'm assuming that's done when we pregenerate the rules and not at runtime.
Benchmark results from GitHub Actions
Lower numbers are good, higher numbers are bad. A ratio less than 1
means a speed up and greater than 1 means a slowdown. Green lines
beginning with + are slowdowns (the PR is slower then master or
master is slower than the previous release). Red lines beginning
with - are speedups.
Significantly changed benchmark results (PR vs master)
Significantly changed benchmark results (master vs previous release)
before after ratio
[26f7bdbe] [829cc6c8]
<sympy-1.11^0>
- 956±3μs 621±0.5μs 0.65 solve.TimeSparseSystem.time_linear_eq_to_matrix(10)
- 2.78±0.01ms 1.16±0ms 0.42 solve.TimeSparseSystem.time_linear_eq_to_matrix(20)
- 5.56±0.03ms 1.71±0ms 0.31 solve.TimeSparseSystem.time_linear_eq_to_matrix(30)
Full benchmark results can be found as artifacts in GitHub Actions (click on checks at the top of the PR).
Sorting sounds good. I'm assuming that's done when we pregenerate the rules and not at runtime.
Yes, only in pre-generation. The sorting is applied to the order of arguments in And and Or.
I was thinking something like this:
from collections import defaultdict
beta_rules = _pre_calculated_assumptions['beta_rules']
defined_facts = _pre_calculated_assumptions['defined_facts']
full_implications = _pre_calculated_assumptions['full_implications']
prereq = _pre_calculated_assumptions['prereq']
beta_triggers = _pre_calculated_assumptions['beta_triggers']
def defined_facts_lines():
yield f'defined_facts = ['
for fact in defined_facts:
yield f' {fact!r},'
yield f'] # defined_facts'
def full_implications_lines():
full_implications_dict = defaultdict(dict)
for (fact, value), implications in full_implications.items():
full_implications_dict[fact][value] = implications
yield 'full_implications = {'
for fact in defined_facts:
implications = full_implications_dict[fact]
for value in (True, False):
yield f' # Implications of {fact} = {value}:'
yield f' (({fact!r}, {value!r}), ('
for implied in sorted(implications.get(value, ())):
yield f' {implied!r},'
yield f' ),'
yield f''
yield '} # full_implications'
def prereq_lines():
yield 'prereq = {'
yield ''
for fact in sorted(prereq):
yield f' # facts that could determine the value of {fact}'
yield f' {fact!r}: {{'
for pfact in sorted(prereq[fact]):
yield f' {pfact!r},'
yield ' },'
yield ''
yield '} # prereq'
def beta_rules_lines():
reverse_implications = defaultdict(list)
for pre, implied in beta_rules:
reverse_implications[implied].append(pre)
yield 'beta_rules = ['
yield ''
for implied in sorted(reverse_implications):
fact, value = implied
yield f' # Rules implying {fact} = {value}'
for pre in reverse_implications[implied]:
ruleno = beta_rules.index((pre, implied))
setstr = ", ".join(map(str, sorted(pre)))
yield f' ({{{setstr}}},'
yield f' {implied!r}), # rule number {ruleno}'
yield ''
yield '] # beta_rules'
def beta_triggers_lines():
yield 'beta_triggers = {'
for query in sorted(beta_triggers):
fact, value = query
yield f' {query!r}: {beta_triggers[query]!r},'
yield '} # beta_triggers'
def all_lines():
yield from defined_facts_lines()
yield ''
yield ''
yield from full_implications_lines()
yield ''
yield ''
yield from prereq_lines()
yield ''
yield ''
yield from beta_rules_lines()
yield ''
yield ''
yield from beta_triggers_lines()
def print_rules():
for line in all_lines():
print(line)
print_rules()
I just realised that this messes up the ordering of beta_rules though so I guess beta_triggers needs to be renumbered or something.
I was thinking something like this:
from collections import defaultdict ... ... def print_rules(): for line in all_lines(): print(line) print_rules()I just realised that this messes up the ordering of
beta_rulesthough so I guessbeta_triggersneeds to be renumbered or something.
Ok, the example above not only changes the format but in addition adds some informative comments. The indices in beta_triggers are related to the ordering of beta_rules so changing that is a bit more work.
Is there are change that changing the ordering of the rules will change behavior or performance of sympy? If so, then I would like to keep the original ordering and leave the ordering of the rules to another PR.
@oscarbenjamin
Is there are change that changing the ordering of the rules will change behavior or performance of sympy?
It shouldn't affect performance. It only matters for behaviour that the ordering is internally consistent and I think we'd find out pretty quickly if it wasn't.
I haven't tested this but I think it should work:
def beta_rules_lines():
reverse_implications = defaultdict(list)
for n, (pre, implied) in enumerate(beta_rules):
reverse_implications[implied].append((pre, n))
yield 'beta_rules = ['
yield ''
m = 0
indices = {}
for implied in sorted(reverse_implications):
fact, value = implied
yield f' # Rules implying {fact} = {value}'
for pre, n in reverse_implications[implied]:
indices[n] = m
m += 1
ruleno = beta_rules.index((pre, implied))
setstr = ", ".join(map(str, sorted(pre)))
yield f' ({{{setstr}}},'
yield f' {implied!r}), # rule number {ruleno}'
yield ''
yield '] # beta_rules'
yield 'beta_triggers = {'
for query in sorted(beta_triggers):
fact, value = query
triggers = [indices[n] for n in beta_triggers[query]]
yield f' {query!r}: {triggers!r},'
yield '} # beta_triggers'
Looks good. Thanks