cutde
Add CUDA matrix-vector product functions that use the matrices output by `disp_block` and `disp_aca`.
Iterating over the blocks in a Python loop is quite inefficient, because each block costs a separate small product. See the matrix-vector product section here: https://tbenthompson.com/book/tdes/hmatrix.html#a-matrix-vector-product
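For reference, the Python-side loop this issue wants to replace looks roughly like the sketch below. It is a minimal NumPy illustration, not cutde's API: the function name `hmatrix_matvec` and the block layouts (tuples of offsets plus a dense block, or offsets plus low-rank factors `U` and `V`) are assumptions made for the example and may not match what `disp_block` and `disp_aca` actually return.

```python
import numpy as np


def hmatrix_matvec(n_rows, dense_blocks, lowrank_blocks, x):
    """Block-wise product y = A @ x, looping over blocks in Python.

    Hypothetical layouts (assumed for illustration only):
      dense_blocks:   list of (row_start, col_start, block), block shape (m, n)
      lowrank_blocks: list of (row_start, col_start, U, V),
                      U shape (m, r), V shape (r, n), block ~= U @ V
    """
    y = np.zeros(n_rows, dtype=x.dtype)

    # Near-field blocks are stored densely: one small matvec per block.
    for r0, c0, block in dense_blocks:
        m, n = block.shape
        y[r0:r0 + m] += block @ x[c0:c0 + n]

    # Far-field blocks are stored as low-rank factors: apply V first, then U,
    # so the cost is O(r * (m + n)) instead of O(m * n).
    for r0, c0, U, V in lowrank_blocks:
        m, n = U.shape[0], V.shape[1]
        y[r0:r0 + m] += U @ (V @ x[c0:c0 + n])

    return y
```

Each loop iteration launches one or two tiny products, so interpreter and launch overhead dominate once there are many blocks. A CUDA version would presumably process all blocks in a single kernel launch over packed block data.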