Julian Regalado Perez

3 comments by Julian Regalado Perez

Are there still plans for this? Would pull requests be considered? I assume a simple starting point would be to add another kad_sgemm_simple that wraps cublasSgemm just like there is the...

It would be interesting to hear what @attractivechaos thinks about this. I tested cublas_sgemm vs. cblas_sgemm and, as expected, you only start getting performance gains on very large matrices. Too...

Indeed, the transfer between host and device is massive (99% of the time on my system). The actual sgemm computation is insanely fast. I have attached a file that...