nengo-ocl
Improve Batched GEMV Speed
Storing matrices in column-major rather than row-major order was found to speed up GEMV in nearly all cases, including some practical scenarios such as Spaun 2.0.
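As a rough illustration of why the layout can matter, the sketch below computes the same GEMV from a row-major and a column-major flat buffer and reports how far apart in memory two neighbouring work-items read. It assumes one work-item per output row, which is an assumption made for illustration and may not match the actual nengo-ocl kernel; all function names here are hypothetical.

```python
def offset_row_major(i, j, n_rows, n_cols):
    """Flat index of A[i, j] when rows are stored contiguously."""
    return i * n_cols + j


def offset_col_major(i, j, n_rows, n_cols):
    """Flat index of A[i, j] when columns are stored contiguously."""
    return j * n_rows + i


def pack(rows, offset):
    """Pack a list-of-rows matrix into a flat buffer using the given layout."""
    n_rows, n_cols = len(rows), len(rows[0])
    flat = [0.0] * (n_rows * n_cols)
    for i in range(n_rows):
        for j in range(n_cols):
            flat[offset(i, j, n_rows, n_cols)] = rows[i][j]
    return flat


def gemv(flat, x, n_rows, n_cols, offset):
    """Compute y = A @ x from a flat buffer, given a layout offset function."""
    return [
        sum(flat[offset(i, j, n_rows, n_cols)] * x[j] for j in range(n_cols))
        for i in range(n_rows)
    ]


def stride_between_work_items(offset, n_rows, n_cols, j=0):
    """Address gap between hypothetical work-items i=0 and i=1 reading column j."""
    return offset(1, j, n_rows, n_cols) - offset(0, j, n_rows, n_cols)


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
x = [1.0, 0.5, 2.0]

# Both layouts give the same result; only the memory access pattern differs.
y_row = gemv(pack(A, offset_row_major), x, 2, 3, offset_row_major)
y_col = gemv(pack(A, offset_col_major), x, 2, 3, offset_col_major)

stride_row = stride_between_work_items(offset_row_major, 2, 3)
stride_col = stride_between_work_items(offset_col_major, 2, 3)
print(y_row, y_col, stride_row, stride_col)
```

Under the one-work-item-per-row assumption, neighbouring work-items read adjacent addresses in the column-major buffer but addresses `n_cols` apart in the row-major one. On GPUs, adjacent accesses can generally be coalesced into fewer memory transactions.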
The LIF kernel is also updated to use the more accurate LIF model from Nengo 2.1.1. This was found to have a negligible performance penalty.
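This note does not spell out what changed in the model. One common way to make a discrete-time LIF update more accurate is to integrate the membrane equation exactly over each step and to place the spike time inside the step rather than at its edge. The sketch below illustrates that idea for a single neuron. It is an assumption about the kind of change involved, not the Nengo or nengo-ocl code, and `lif_step` and its parameters are hypothetical.

```python
import math


def lif_step(J, voltage, refractory_time, dt=0.001, tau_rc=0.02, tau_ref=0.002):
    """Advance one LIF neuron by dt; returns (spiked, voltage, refractory_time)."""
    # Remaining refractory time, measured from the start of this step.
    refractory_time -= dt
    # Portion of this step during which the neuron is free to integrate.
    delta_t = min(max(dt - refractory_time, 0.0), dt)

    # Exact solution of tau_rc * dV/dt = J - V over delta_t (no Euler error).
    voltage -= (J - voltage) * math.expm1(-delta_t / tau_rc)

    spiked = voltage > 1.0
    if spiked:
        # Time from the start of the step to the threshold crossing.
        t_spike = dt + tau_rc * math.log1p(-(voltage - 1.0) / (J - 1.0))
        voltage = 0.0
        refractory_time = tau_ref + t_spike
    else:
        voltage = max(voltage, 0.0)
    return spiked, voltage, refractory_time


def count_spikes(J, n_steps=1000, dt=0.001):
    """Count spikes for a constant input current J over n_steps steps."""
    voltage, refractory_time, count = 0.0, 0.0, 0
    for _ in range(n_steps):
        spiked, voltage, refractory_time = lif_step(J, voltage, refractory_time, dt=dt)
        count += int(spiked)
    return count


n_spikes = count_spikes(J=2.0)
# Analytic steady-state LIF rate for comparison, in Hz.
analytic_rate = 1.0 / (0.002 + 0.02 * math.log(2.0 / (2.0 - 1.0)))
print(n_spikes, analytic_rate)
```

The extra work per neuron is one `expm1` per step and one `log1p` per spike, which fits with the small cost reported above.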
The current, non-autotuned column-major kernel cuts simulation time on Spaun 2.0 by 31 s (202 s -> 171 s, about 15%) on a GTX 970.