py-videocore
Implementation in C
I've tried the Python example code on my RPi2, and the multithreaded sgemm computation time is really impressive.
Is there an sgemm implementation in C, so that it can be used from other C applications that rely heavily on matrix computation (e.g. convolution in computer vision)?
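To make the request concrete, here is a rough sketch of the kind of C entry point I have in mind. The name `sgemm_ref` and its parameter order are my own assumptions for illustration and are not part of py-videocore; the body is only a plain CPU reference loop, not the GPU kernel.

```c
#include <stddef.h>

/* Hypothetical interface (an assumption, not py-videocore's API).
 * Computes C = alpha * A * B + beta * C for row-major matrices:
 *   A is m x k, B is k x n, C is m x n.
 * Plain single-threaded CPU reference, useful only to pin down the
 * semantics a GPU-backed C function would need to match. */
static void sgemm_ref(size_t m, size_t n, size_t k,
                      float alpha, const float *a, const float *b,
                      float beta, float *c)
{
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            /* Dot product of row i of A with column j of B. */
            for (size_t p = 0; p < k; p++) {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

A C application (for example, a convolution lowered to a matrix multiply) could then call something with this shape and swap the reference loop for a GPU-backed version if one exists.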