ajhool
try `luarocks install nn`
> It looks like the memory usage significantly increased in the Cuda 8.0 (Feb 2017) update, while previous versions of Cuda were around 23.5% more efficient, unless the issue scales...
That's really interesting. I sensed a significant speed increase on the newer versions of CUDA and cuDNN, but "sensed" is the operative word because I didn't do a controlled benchmark....
**Tried:** `export OMP_NUM_THREADS=4`, then `source ~/.bashrc` **Result:** No apparent change
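Before reading too much into "no apparent change", it may be worth confirming that the variable actually reaches the process being launched. An `export` takes effect in the current shell right away; `source ~/.bashrc` only matters if the export line was added to that file. A minimal sketch of the check, where the variable name `child_value` is just illustrative:

```shell
# Set the OpenMP thread cap for this shell and everything it launches.
export OMP_NUM_THREADS=4

# Spawn a child process and read the value back from its environment.
# If this prints 4, programs started from this shell will see the cap too.
child_value=$(sh -c 'echo "$OMP_NUM_THREADS"')
echo "child sees OMP_NUM_THREADS=$child_value"
```

If the child reports the expected value and behaviour is still unchanged, the thread cap is probably not the limiting factor, assuming the job is started from this same shell session.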
> BTW I don't understand "4x 4GB vCPU" "16GB total". My cores all share the same RAM and I fail to see how it could be otherwise. I was just...
@htoyryla > Using -print_iter 1 was not enough to cause a bottleneck. GPU util was almost 100% all the time, CPU util around 100%. Would you mind clarifying that? Isn't...
Ah, yes that makes total sense, thanks. I think you understand my problem perfectly. I originally thought that the CPU was only/predominantly being utilized for saving iterations, so that's why...
I believe autotune works by taking a lot of memory on the first pass to reduce the memory on subsequent passes. Perhaps the spike in the beginning is causing a...
I'm finding that `require cudnn` on a Volta GPU takes 10 minutes. @clement-masson, any idea how I can profile the require function to see what exactly is taking so long...
@nagadomi, I'm using your distro with CUDA 9/10 support. Any ideas why the bindings might be struggling with the Volta architecture?