Andrej
Do we not have to include it? Do we need to change the .cu as well?
wow the diff on this PR. This is what is needed to compile C on Windows?
The const changes look good even as a general thing independent of Windows, I'd merge that regardless of Windows considerations. The makefile echo I'm not a huge fan of. Ideally...
Yeah I think this is a good idea, I have a janky uncommitted version of this but I think it makes sense to merge to master in some form.
I cannot reproduce this speedup. Also I don't fully understand it.
But which kernel is this re-utilizing from? The previous kernel is a cuBLAS matmul.
merged in 315af5f, slightly modified with comments, ty. it does help about 1ms/iter on average for me, from 193ms -> 192ms. might be more on other GPUs
So I want to do this, I'm just not sure when :)
I certainly don't want to go down the IFDEF switching hell.
Hey David, sorry. recurrentjs was not really meant for production or cleanliness, it's an "are you a neural nets expert? ok here's some dump of code you might like" kind of...