[DOCS] Add caveats for the adjoint of C in wp.matmul()

Open daedalus5 opened this issue 2 months ago • 0 comments

Category

  • [ ] Report an error in the documentation.
  • [x] Request for something to be documented.
  • [ ] Suggestion to improve the documentation.
  • [ ] Other (please explain)

Description

Document the differentiability nuances of wp.tile_matmul, in particular the bias term.

Because C is overwritten in place by the update C = alpha * A * B + beta * C, the adjoint of C must be handled with care in gradient calculations. The gradient will only be correct if C is passed to linear functions.
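A minimal scalar sketch of how this can go wrong, in plain Python rather than Warp. It assumes the backward pass re-reads the stored value of C after the update has overwritten it; that mechanism, the helper `grad_wrt_c0`, and the buffer `buf` are illustrative assumptions, not Warp's actual implementation.

```python
import math

def grad_wrt_c0(f, df, in_place, c0=0.5, a=2.0, b=3.0, alpha=1.0, beta=0.5):
    """Reverse-mode dL/dc0 for L = f(c0) + (alpha*a*b + beta*c0)."""
    buf = [c0]                       # hypothetical storage standing in for C
    _y = f(buf[0])                   # forward: f reads C before the update
    c_new = alpha * a * b + beta * buf[0]
    if in_place:
        buf[0] = c_new               # C is overwritten, the old value is lost
    adj = beta                       # adjoint through the update (linear in C)
    adj += df(buf[0])                # adjoint of f re-reads the stored C
    return adj

# Nonlinear f: the derivative depends on the value of C, so a stale read matters.
exact_sin = grad_wrt_c0(math.sin, math.cos, in_place=False)
stale_sin = grad_wrt_c0(math.sin, math.cos, in_place=True)

# Linear f: the derivative is constant, so overwriting C is harmless.
exact_lin = grad_wrt_c0(lambda c: 3.0 * c, lambda c: 3.0, in_place=False)
stale_lin = grad_wrt_c0(lambda c: 3.0 * c, lambda c: 3.0, in_place=True)

print("sin:    exact", exact_sin, "in-place", stale_sin)
print("linear: exact", exact_lin, "in-place", stale_lin)
```

In this sketch the in-place run disagrees with the reference gradient only for the nonlinear function, which matches the caveat above.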

In nonlinear graphs, the matrix multiplication may be rewritten using other builtins, but it will be slower.

daedalus5 • Oct 23 '25 16:10