sympy
sympy copied to clipboard
Fixing bugs reported in the new matrix derivation algorithm
- fixes https://github.com/sympy/sympy/issues/24021
- fixes https://stackoverflow.com/questions/69874089/calculating-the-gradient-and-hessian-of-a-symbolic-function-containing-matrices/69876854#69876854
Follows from discussion: https://github.com/sympy/sympy/pull/22476#issuecomment-1546875971
NO ENTRY
:white_check_mark:
Hi, I am the SymPy bot. I'm here to help you write a release notes entry. Please read the guide on how to write release notes.
- No release notes entry will be added for this pull request.
Click here to see the pull request description that was parsed.
* fixes https://github.com/sympy/sympy/issues/24021
* fixes https://stackoverflow.com/questions/69874089/calculating-the-gradient-and-hessian-of-a-symbolic-function-containing-matrices/69876854#69876854
Follows from discussion: https://github.com/sympy/sympy/pull/22476#issuecomment-1546875971
<!-- BEGIN RELEASE NOTES -->
NO ENTRY
<!-- END RELEASE NOTES -->
Benchmark results from GitHub Actions
Lower numbers are good, higher numbers are bad. A ratio less than 1
means a speed up and greater than 1 means a slowdown. Green lines
beginning with + are slowdowns (the PR is slower then master or
master is slower than the previous release). Red lines beginning
with - are speedups.
Significantly changed benchmark results (PR vs master)
Significantly changed benchmark results (master vs previous release)
Full benchmark results can be found as artifacts in GitHub Actions (click on checks at the top of the PR).
@oscarbenjamin feel free to review :)
This possibly fixes some of the examples in gh-23492 but the final one fails:
In [29]: import sympy as sp
...: A=sp.MatrixSymbol('A', 3, 3)
...: sp.pprint(sp.trace(A).diff(A))
...:
...: sp.pprint(sp.trace(A).as_explicit().diff(A))
𝕀
---------------------------------------------------------------------------
AttributeError: 'MatrixElement' object has no attribute 'shape'
See also gh-23364:
In [38]: x = symbols('x')
In [39]: F = MatrixSymbol('F', 3, 3)
In [40]: ((x*F)**2).diff(x)
---------------------------------------------------------------------------
TypeError
See also gh-21597:
In [66]: b = MatrixSymbol('b', 3, 1)
In [67]: c = MatrixSymbol('c', 3, 1)
In [68]: F = MatrixSymbol('F', 3, 3)
In [69]: Derivative(b - F*c, b).doit()
---------------------------------------------------------------------------
TypeError
Also https://github.com/sympy/sympy/issues/17229#issuecomment-591863771
This PR also gives a new error for the example in gh-15651:
In [1]: A = MatrixSymbol('A', 3, 3)
In [2]: diff(A.as_explicit()*A, A)
...
SymPyDeprecationWarning:
Using non-Expr arguments in Mul is deprecated (in this case, one of
the arguments has type 'ImmutableDenseNDimArray').
...
1052 # Check that no contraction happens when the shape is mismatched:
1053 for i in contraction_indices:
-> 1054 if len({shape[j] for j in i if shape[j] != -1}) != 1:
1055 raise ValueError("contracting indices of different dimensions")
IndexError: tuple index out of range
OK, I'll try to address all these problems in the next days.
One big question arises: what is the shape of nested array/matrix-like objects? For example, a matrix whose elements are matrix symbols should be shape (n, n) or (n, n, n, n)?
what is the shape of nested array/matrix-like objects? For example, a matrix whose elements are matrix symbols should be shape
(n, n)or(n, n, n, n)?
This kind of thing is why I find it difficult to fix code in this area myself. I often don't have a clear idea of what the right answer is supposed to be. This is one possible answer:
In [5]: X = MatrixSymbol('X', 3, 3)
In [6]: M = Matrix([[X, X], [X, X]])
In [7]: M
Out[7]:
⎡X₀₀ X₀₁ X₀₂ X₀₀ X₀₁ X₀₂⎤
⎢ ⎥
⎢X₁₀ X₁₁ X₁₂ X₁₀ X₁₁ X₁₂⎥
⎢ ⎥
⎢X₂₀ X₂₁ X₂₂ X₂₀ X₂₁ X₂₂⎥
⎢ ⎥
⎢X₀₀ X₀₁ X₀₂ X₀₀ X₀₁ X₀₂⎥
⎢ ⎥
⎢X₁₀ X₁₁ X₁₂ X₁₀ X₁₁ X₁₂⎥
⎢ ⎥
⎣X₂₀ X₂₁ X₂₂ X₂₀ X₂₁ X₂₂⎦
Whatever the answer is it should be consistent with the above or the above should be changed to be consistent with it.
Whatever the answer is it should be consistent with the above or the above should be changed to be consistent with it.
The problem is that such a change is not at all trivial.
The problem is that such a change is not at all trivial.
Agreed. Getting these definitions pinned down and documented is more important than anything else though. If we are clear what the correct behaviour should be then fixing it might be difficult but at least we can move in the right direction.