rocSOLVER

Hybrid CPU host + GPU device execution of bdsqr()

Open EdDAzevedo opened this issue 1 year ago • 0 comments

Hybrid CPU host + GPU device execution of bdsqr(), intended to speed up SVD calculations.

bdsqr_host() is heavily influenced by the LAPACK routine dbdsqr(): the "D" and "E" arrays are reduced on the CPU, and the resulting rotations are then copied to the GPU to update the "V", "U", and "C" arrays.
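
The split between generating rotations and applying them can be sketched in plain C++. This is only an illustration of the idea, not the code in this change: the names `Rotation`, `make_rotation`, and `apply_rotations_to_rows` are hypothetical, the matrix is assumed to be row-major, and the update loop that would run as a GPU kernel is shown here as an ordinary host loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for one plane (Givens) rotation.
struct Rotation {
    double c;  // cosine
    double s;  // sine
};

// Host side: build a rotation such that
//   [ c  s ] [ f ]   [ r ]
//   [-s  c ] [ g ] = [ 0 ]
// This is a simplified stand-in for what LAPACK's dlartg computes.
Rotation make_rotation(double f, double g, double& r) {
    if (g == 0.0) {
        r = f;
        return {1.0, 0.0};
    }
    r = std::hypot(f, g);
    return {f / r, g / r};
}

// "Device" side (simulated on the host): rotation k acts on rows k and k+1
// of a row-major nrows x ncols matrix. In a hybrid scheme the batch of
// rotations would be copied to the GPU and the inner loop over columns
// would be done in parallel by a kernel.
void apply_rotations_to_rows(const std::vector<Rotation>& rots,
                             std::vector<double>& A,
                             std::size_t nrows,
                             std::size_t ncols) {
    assert(rots.size() + 1 == nrows);
    assert(A.size() == nrows * ncols);
    for (std::size_t k = 0; k < rots.size(); ++k) {
        const double c = rots[k].c;
        const double s = rots[k].s;
        for (std::size_t j = 0; j < ncols; ++j) {
            const double x = A[k * ncols + j];
            const double y = A[(k + 1) * ncols + j];
            A[k * ncols + j] = c * x + s * y;
            A[(k + 1) * ncols + j] = c * y - s * x;
        }
    }
}
```

The reduction of "D" and "E" is inherently sequential, while each rotation touches whole rows or columns of "V", "U", or "C" independently per column, which is why batching the rotations and applying them on the device is attractive.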

As a special case, if no rotations are needed for the "V", "U", and "C" arrays, the LAPACK version of bdsqr() is called directly.
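
A minimal sketch of that dispatch is shown below. It assumes that "no rotations are needed" means none of the three matrices were requested, that is, their column/row counts are all zero. The enum `BdsqrPath` and the function `choose_bdsqr_path` are hypothetical names used for illustration, and the parameters `nv`, `nu`, and `nc` are assumed to be the sizes of "V", "U", and "C".

```cpp
#include <cassert>

// Hypothetical labels for the two execution paths described above.
enum class BdsqrPath {
    HostLapackOnly,   // only singular values wanted: run LAPACK bdsqr on the CPU
    HybridHostDevice  // reduce D and E on the CPU, update V, U, C on the GPU
};

// Pick a path from the (assumed) sizes of V, U, and C.
BdsqrPath choose_bdsqr_path(int nv, int nu, int nc) {
    assert(nv >= 0 && nu >= 0 && nc >= 0);
    const bool no_updates = (nv == 0) && (nu == 0) && (nc == 0);
    return no_updates ? BdsqrPath::HostLapackOnly
                      : BdsqrPath::HybridHostDevice;
}
```

With no matrices to update there is nothing to copy to the device, so the host-only path avoids the transfer entirely.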

EdDAzevedo, Jun 17 '24 20:06