sympy
sympy copied to clipboard
let decompose_power handle base reduction
References to other Issues or PRs
#5149
Brief description of what is fixed or changed
By decomposing a Rational base into a perfect power, cases where two generators were formerly identified by Poly will now give one:
>>> Poly(4**x+2**x)
Poly((2**x)**2 + (2**x), 2**x, domain='ZZ') <-- formerly two generators 2**x and 4**x
Other comments
While Poly
calls decompose_power_rat
, poly
does not, so the latter needed editing, too.
It was nice to see the continuous random variable test simplify from
2*531441**(-k)*sqrt(k)*theta*(3*3**(12*k) - 2*531441**k) /(sqrt(k**2 + 18)*sqrt(theta**2 + 18))
to
2*sqrt(k)*theta/(sqrt(k**2 + 18)*sqrt(theta**2 + 18))
When the 531441**(-k)*(3*3**(12*k) - 2*531441**k)
is recognized as 1
Release Notes
- core
-
exprtools.decompose_power
identifies perfect power in Rational bases sodecompose_power(4**x)->(2**x, 2)
-
- polys
-
Poly
andpoly
should both make powers with Rational bases canonical, e.g.2**(2*x)
instead of4**x
-
:white_check_mark:
Hi, I am the SymPy bot. I'm here to help you write a release notes entry. Please read the guide on how to write release notes.
Your release notes are in good order.
Here is what the release notes will look like:
-
core
-
polys
This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.13.
Click here to see the pull request description that was parsed.
<!-- Your title above should be a short description of what
was changed. Do not include the issue number in the title. -->
#### References to other Issues or PRs
<!-- If this pull request fixes an issue, write "Fixes #NNNN" in that exact
format, e.g. "Fixes #1234" (see
https://tinyurl.com/auto-closing for more information). Also, please
write a comment on that issue linking back to this pull request once it is
open. -->
#5149 #19776
#### Brief description of what is fixed or changed
By decomposing a Rational base into a perfect power, cases where two generators were formerly identified by Poly will now give one:
```python
>>> Poly(4**x+2**x)
Poly((2**x)**2 + (2**x), 2**x, domain='ZZ') <-- formerly two generators 2**x and 4**x
```
#### Other comments
While `Poly` calls `decompose_power_rat`, `poly` does not, so the latter needed editing, too.
It was nice to see the continuous random variable test simplify from
```python
2*531441**(-k)*sqrt(k)*theta*(3*3**(12*k) - 2*531441**k) /(sqrt(k**2 + 18)*sqrt(theta**2 + 18))
```
to
```python
2*sqrt(k)*theta/(sqrt(k**2 + 18)*sqrt(theta**2 + 18))
```
When the `531441**(-k)*(3*3**(12*k) - 2*531441**k)` is recognized as 1
#### Release Notes
<!-- Write the release notes for this release below between the BEGIN and END
statements. The basic format is a bulleted list with the name of the subpackage
and the release note for this PR. For example:
* solvers
* Added a new solver for logarithmic equations.
* functions
* Fixed a bug with log of integers.
or if no release note(s) should be included use:
NO ENTRY
See https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more
information on how to write release notes. The bot will check your release
notes automatically to see if they are formatted correctly. -->
<!-- BEGIN RELEASE NOTES -->
* core
* `exprtools.decompose_power` identifies perfect power in Rational bases so `decompose_power(4**x)->(2**x, 2)`
* polys
* `Poly` and `poly` should both make powers with Rational bases canonical, e.g. `2**(2*x)` instead of `4**x`
<!-- END RELEASE NOTES -->
failing: File "/simplify/tests/test_hyperexpand.py", line 687, in test_prudnikov_misc: assert can_do([a], [a - S.Half, 2*a + 1])
Any objections here?
I need to think about this a bit but it seems good. It also extends the capability of factor
:
In [1]: factor(4**x - 1) # master
Out[1]:
x
4 - 1
In [3]: factor(4**x - 1) # PR
Out[3]:
⎛ x ⎞ ⎛ x ⎞
⎝2 - 1⎠⋅⎝2 + 1⎠
This I'm less sure about:
In [3]: factor(4**x + 1)
Out[3]:
2⋅x
2 + 1
Maybe that's okay since factor
generally puts expressions into a particular canonical form (e.g. expands all powers etc).
Presumably other polynomial functions like gcd, cancel, quo etc are affected:
In [1]: gcd(2**x - 1, 4**x - 1) # master
Out[1]: 1
In [12]: gcd(2**x - 1, 4**x - 1) # PR
Out[12]:
x
2 - 1
I think this is the "sweet spot" to put this handling. It's a very targeted transformation and, as you note, any routine that uses this to put expressions in a generator-canonical form will benefit. Please commit when ready.
I think there should be more exploration of how this changes things and more tests for the different functions that would be affected.
It would probably be good to go further than this and handle things like 2**(x/2)
. I don't propose to make that change here (the same problem exists for sqrt(x)
) but since the code will likely be changed in future it's important that there are tests for any functionality that this enables.
Another place affected is construct_domain
:
In [8]: construct_domain([4**x, 2**x]) # master
Out[8]: (EX, [EX(4**x), EX(2**x)])
In [1]: construct_domain([4**x, 2**x]) # PR
Out[1]: (ZZ[2**x], [(2**x)**2, (2**x)])
handle things like 2**(x/2)
please elaborate. Poly(4**(x/2))
should become Poly(2**x, 2**x)
handle things like 2**(x/2)
please elaborate.
Poly(4**(x/2))
should becomePoly(2**x, 2**x)
More like this:
In [4]: Poly(2**x + 4**x)
Out[4]: Poly((2**x)**2 + (2**x), 2**x, domain='ZZ')
In [5]: Poly(2**x + 2**(x/2))
Out[5]: Poly((2**(x/2)) + (2**x), 2**(x/2), 2**x, domain='ZZ')
In [6]: Poly(x + x**2)
Out[6]: Poly(x**2 + x, x, domain='ZZ')
In [7]: Poly(sqrt(x) + x)
Out[7]: Poly(x + (sqrt(x)), x, sqrt(x), domain='ZZ')
This leads to:
In [9]: e = (sqrt(x) - 1)*(sqrt(x) - 2)
In [10]: e
Out[10]: (√x - 2)⋅(√x - 1)
In [11]: expand(e)
Out[11]: -3⋅√x + x + 2
In [12]: factor(_)
Out[12]: -3⋅√x + x + 2
Similarly with this PR:
In [13]: e = (2**x - 1)*(2**x - 2)
In [14]: expand(e)
Out[14]:
2⋅x x
2 - 3⋅2 + 2
In [15]: factor(_)
Out[15]:
⎛ x ⎞ ⎛ x ⎞
⎝2 - 2⎠⋅⎝2 - 1⎠
In [16]: e = (2**(x/2) - 1)*(2**(x/2) - 2)
In [17]: expand(e)
Out[17]:
x
─
2 x
- 3⋅2 + 2 + 2
In [18]: factor(_)
Out[18]:
x
─
2 x
- 3⋅2 + 2 + 2
I haven't thought through in detail what's the best way to handle this. As I said though it's probably best not to try to include that in this PR. Here all that's needed is good test coverage for the effects of the change already made.
I just looked at where this function is called. It's very low level:
- decompose_power
- as_terms
- as_ordered_terms
- traversal
- epathtools
- printer
- formal
- manual_integrate
- sort_key
- as_ordered_terms
-
_parallel_dict_from_expr_*_gens
- *dict_from_expr
- minpoly
- Monomial
- *dict_from_expr
- as_terms
Searching in the repo via github, in the 9 pages of results I can't find where Monomial is ever called...hmm. I'm not sure about writing tests for anything higher than Poly -- I can't imagine that we will soon stop using Poly for low level manipulation of expression. I can try follow your example and just write some for the expression-wrangling functions that we have. But all they will be reflecting is that Poly tracks its generator(s) -- so factoring 2**x + 4**x
is no different than factoring y + y**2
.
More like this:
Oh, ok. I'm familiar with that. In solve
we look for generators that are powers of others. That is something that can only be determined in context of all generators. There is no way that decompose_power
knows in what context its generator is being used. The benefits to Poly from this PR are a consequence of having the Rational base powers have a canonical base. I can look into this for a future PR.
Anything that changes how Poly
chooses its generators or how construct_domain
decides on a computable ring will affect all high-level functions that use Poly
for their internal work. A change like that needs a lot of tests.
A shortlist of functions to consider: Poly, poly, as_poly, div, rem, quo, exquo, invert, gcd, primitive, content, sqf, factor, cancel, construct_domain, apart.
A change like that needs a lot of tests.
I'm going to write the tests, but my point is that there is no difference to Poly-related functions whether the generators are x
, sin(x)
, or 2**x
. So targetting Poly
, poly
(which was caught be a test already) and as_poly
seem to be the sufficient targets for tests unless any of those other functions you listed uses a method other than these 3 to construct the primitives with which the work is done. I will look into that.
breadcrumb: this should pass but doesn't -- more work needs to be done with poly
def test_rational_base_generator():
for a, b in [(2**x, 4**x), (x, x**2)]:
p = Poly(a + b)
assert poly(a + b) == p
assert (a + b).as_poly() == p
assert p.gens == (a,)
# give gen a
assert Poly(a + b, a) == p
assert poly(a + b, a) == p
assert (a + b).as_poly(a) == p
# give gen b
raises(PolynomialError, lambda: Poly(a + b, b))
raises(PolynomialError, lambda: poly(a + b, b))
assert (a + b).as_poly(b) is None
Benchmark results from GitHub Actions
Lower numbers are good, higher numbers are bad. A ratio less than 1
means a speed up and greater than 1 means a slowdown. Green lines
beginning with +
are slowdowns (the PR is slower then master or
master is slower than the previous release). Red lines beginning
with -
are speedups.
Significantly changed benchmark results (PR vs master)
Significantly changed benchmark results (master vs previous release)
before after ratio
[41d90958] [a380cfaa]
<sympy-1.11.1^0>
- 968±2μs 621±0.9μs 0.64 solve.TimeSparseSystem.time_linear_eq_to_matrix(10)
- 2.78±0.01ms 1.17±0ms 0.42 solve.TimeSparseSystem.time_linear_eq_to_matrix(20)
- 5.52±0.01ms 1.72±0ms 0.31 solve.TimeSparseSystem.time_linear_eq_to_matrix(30)
Full benchmark results can be found as artifacts in GitHub Actions (click on checks at the top of the PR).
I am not aware of anything more that is needed for this. The test suite does not give much feedback with this change -- only one test was changed for the better. Unless someone can show that this is broken, I will commit and let the larger community's use-cases drive further development.
This fails:
In [10]: ZZ[2**x].from_sympy(4**x)
---------------------------------------------------------------------------
CoercionFailed: expected an integer, got 4**x
...
ValueError: expected an expression convertible to a polynomial in Polynomial ring in 2**x over ZZ with lex order, got 4**x
This breaks a key invariant which is that the domain returned by construct_domain
should always be able to convert the expressions it was constructed for:
In [20]: vals = [2**x, 4**x]
In [21]: dom, domvals = construct_domain(vals)
In [22]: assert domvals == [dom.from_sympy(v) for v in vals]
---------------------------------------------------------------------------
CoercionFailed
I'm also unsure about turning 4**x
into 2**(2*x)
everywhere:
In [27]: cancel(4**x)
Out[27]:
2⋅x
2
Maybe that's a good thing but I'm not sure.
This breaks a key invariant
I am guessing that I will have to dig into from_sympy
to fix this.
I'm also unsure about turning
4**x
into2**(2*x)
everywhere:
Possibly having to_sympy/as_expr
automatically convert this would help. I'm not sure whether 4**x
or 2**(2*x)
should be preferred in general. It seems unfortunate though to change this if there isn't any reason to do so.
if there isn't any reason to do so.
The motivation to do so is to make Rational-based powers behave the same as symbolic bases:
>>> [len(Poly((x+x**2).simplify()).gens) for x in (sin(x), 2**x)] # simplify to make (2**x)**2 -> 4**x
[1, 2]
This has implications for (as you noted) factoring; it also affects solving equations and removes the onus of looking for these redundant generators from solve
to Poly
.
Maybe this would be better as a type of simplification: tbd(2**x + 4**x) -> 2**x + 2**(2*x)
?
The motivation to do so is to make Rational-based powers behave the same as symbolic bases:
If that's the goal then it's also possible to go further and factorise other things e.g.
In [1]: 2**x*3**x
Out[1]:
x
6
In [2]: factor(2**x*3**x - 2**x)
Out[2]:
x x
- 2 + 6
There's definitely a question of how far we should go here because fully factorising integers can be an expensive operation. Another possibility is that this could be a Poly option or at least that it could be possible to specify these as generators:
In [3]: factor(2**x*3**x - 2**x, [2**x, 3**x])
Out[3]:
x x
- 2 + 6
In [4]: Poly(2**x*3**x - 2**x, [2**x, 3**x])
Out[4]: Poly(-(2**x) + 6**x, 2**x, 3**x, domain='ZZ[6**x]')
Yes, the factorization bothers me, too. Perhaps the better thing to do is attempt a unification of generators that are powers in Poly. Finding gcd is easy compared to factorization.
Maybe this is similar to this problem: https://github.com/sympy/sympy/issues/22852#issuecomment-1012640723