sympy icon indicating copy to clipboard operation
sympy copied to clipboard

Fixing bugs reported in the new matrix derivation algorithm

Open Upabjojr opened this issue 2 years ago • 13 comments

  • fixes https://github.com/sympy/sympy/issues/24021
  • fixes https://stackoverflow.com/questions/69874089/calculating-the-gradient-and-hessian-of-a-symbolic-function-containing-matrices/69876854#69876854

Follows from discussion: https://github.com/sympy/sympy/pull/22476#issuecomment-1546875971

NO ENTRY

Upabjojr avatar May 16 '23 16:05 Upabjojr

:white_check_mark:

Hi, I am the SymPy bot. I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

  • No release notes entry will be added for this pull request.
Click here to see the pull request description that was parsed.
* fixes https://github.com/sympy/sympy/issues/24021
* fixes https://stackoverflow.com/questions/69874089/calculating-the-gradient-and-hessian-of-a-symbolic-function-containing-matrices/69876854#69876854

Follows from discussion: https://github.com/sympy/sympy/pull/22476#issuecomment-1546875971

<!-- BEGIN RELEASE NOTES -->
NO ENTRY
<!-- END RELEASE NOTES -->

sympy-bot avatar May 16 '23 17:05 sympy-bot

Benchmark results from GitHub Actions

Lower numbers are good, higher numbers are bad. A ratio less than 1 means a speed up and greater than 1 means a slowdown. Green lines beginning with + are slowdowns (the PR is slower then master or master is slower than the previous release). Red lines beginning with - are speedups.

Significantly changed benchmark results (PR vs master)


Significantly changed benchmark results (master vs previous release)


Full benchmark results can be found as artifacts in GitHub Actions (click on checks at the top of the PR).

github-actions[bot] avatar May 16 '23 18:05 github-actions[bot]

@oscarbenjamin feel free to review :)

Upabjojr avatar May 16 '23 18:05 Upabjojr

This possibly fixes some of the examples in gh-23492 but the final one fails:

In [29]: import sympy as sp
    ...: A=sp.MatrixSymbol('A', 3, 3)
    ...: sp.pprint(sp.trace(A).diff(A))
    ...: 
    ...: sp.pprint(sp.trace(A).as_explicit().diff(A))
𝕀
---------------------------------------------------------------------------
AttributeError: 'MatrixElement' object has no attribute 'shape'

oscarbenjamin avatar May 17 '23 12:05 oscarbenjamin

See also gh-23364:

In [38]: x = symbols('x')

In [39]: F = MatrixSymbol('F', 3, 3)

In [40]: ((x*F)**2).diff(x)
---------------------------------------------------------------------------
TypeError 

oscarbenjamin avatar May 17 '23 12:05 oscarbenjamin

See also gh-21597:

In [66]: b = MatrixSymbol('b', 3, 1)

In [67]: c = MatrixSymbol('c', 3, 1)

In [68]: F = MatrixSymbol('F', 3, 3)

In [69]: Derivative(b - F*c, b).doit()
---------------------------------------------------------------------------
TypeError

oscarbenjamin avatar May 17 '23 12:05 oscarbenjamin

Also https://github.com/sympy/sympy/issues/17229#issuecomment-591863771

oscarbenjamin avatar May 17 '23 12:05 oscarbenjamin

This PR also gives a new error for the example in gh-15651:

In [1]: A = MatrixSymbol('A', 3, 3)

In [2]: diff(A.as_explicit()*A, A)
...
SymPyDeprecationWarning: 

Using non-Expr arguments in Mul is deprecated (in this case, one of
the arguments has type 'ImmutableDenseNDimArray').
...
   1052 # Check that no contraction happens when the shape is mismatched:
   1053 for i in contraction_indices:
-> 1054     if len({shape[j] for j in i if shape[j] != -1}) != 1:
   1055         raise ValueError("contracting indices of different dimensions")

IndexError: tuple index out of range

oscarbenjamin avatar May 17 '23 12:05 oscarbenjamin

OK, I'll try to address all these problems in the next days.

Upabjojr avatar May 17 '23 14:05 Upabjojr

One big question arises: what is the shape of nested array/matrix-like objects? For example, a matrix whose elements are matrix symbols should be shape (n, n) or (n, n, n, n)?

Upabjojr avatar May 19 '23 21:05 Upabjojr

what is the shape of nested array/matrix-like objects? For example, a matrix whose elements are matrix symbols should be shape (n, n) or (n, n, n, n)?

This kind of thing is why I find it difficult to fix code in this area myself. I often don't have a clear idea of what the right answer is supposed to be. This is one possible answer:

In [5]: X = MatrixSymbol('X', 3, 3)

In [6]: M = Matrix([[X, X], [X, X]])

In [7]: M
Out[7]: 
⎡X₀₀  X₀₁  X₀₂  X₀₀  X₀₁  X₀₂⎤
⎢                            ⎥
⎢X₁₀  X₁₁  X₁₂  X₁₀  X₁₁  X₁₂⎥
⎢                            ⎥
⎢X₂₀  X₂₁  X₂₂  X₂₀  X₂₁  X₂₂⎥
⎢                            ⎥
⎢X₀₀  X₀₁  X₀₂  X₀₀  X₀₁  X₀₂⎥
⎢                            ⎥
⎢X₁₀  X₁₁  X₁₂  X₁₀  X₁₁  X₁₂⎥
⎢                            ⎥
⎣X₂₀  X₂₁  X₂₂  X₂₀  X₂₁  X₂₂⎦

Whatever the answer is it should be consistent with the above or the above should be changed to be consistent with it.

oscarbenjamin avatar May 19 '23 22:05 oscarbenjamin

Whatever the answer is it should be consistent with the above or the above should be changed to be consistent with it.

The problem is that such a change is not at all trivial.

Upabjojr avatar May 19 '23 22:05 Upabjojr

The problem is that such a change is not at all trivial.

Agreed. Getting these definitions pinned down and documented is more important than anything else though. If we are clear what the correct behaviour should be then fixing it might be difficult but at least we can move in the right direction.

oscarbenjamin avatar May 19 '23 22:05 oscarbenjamin