Fabian Tschopp

119 comments by Fabian Tschopp

@soumith Great! I'll make sure to finish my OpenCL cuDNN replacement before initial benchmarking. Could I ask you to also include AMD and Intel GPU (+CPU) benchmarks (where applicable and possible)?...

@soumith A single buffer can't be bigger than 1/4 of the GPU's memory on current AMD GPUs. I think the Fury Nano is not the right candidate to benchmark with....
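To make the arithmetic behind that limit concrete, here is a minimal sketch. It assumes the cap is exactly one quarter of total device memory, as stated in the comment; the function names and the 4 GiB card are hypothetical and chosen only for illustration. On a real device the actual values would come from the OpenCL device queries `CL_DEVICE_MAX_MEM_ALLOC_SIZE` and `CL_DEVICE_GLOBAL_MEM_SIZE` rather than from a fixed ratio.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def max_single_buffer_bytes(total_mem_bytes: int, divisor: int = 4) -> int:
    """Largest single allocation, assuming a cap of total memory / divisor.

    The divisor of 4 mirrors the "1/4 of the GPU's memory" figure from the
    comment; it is an assumption, not a value queried from a driver.
    """
    return total_mem_bytes // divisor


def fits_in_one_buffer(num_elems: int, bytes_per_elem: int,
                       total_mem_bytes: int) -> bool:
    """Check whether a tensor of num_elems elements fits in one buffer."""
    return num_elems * bytes_per_elem <= max_single_buffer_bytes(total_mem_bytes)


if __name__ == "__main__":
    total = 4 * GIB  # hypothetical card with 4 GiB of memory
    cap = max_single_buffer_bytes(total)
    print(f"per-buffer cap: {cap / GIB:.2f} GiB")
    # A float32 tensor with 300 million elements needs about 1.12 GiB.
    print(fits_in_one_buffer(300_000_000, 4, total))
```

Under this assumption, a benchmark layer whose input, output, or weights need more than a quarter of the card's memory in one allocation would have to be split across buffers or skipped on that device.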

My idea to include a U-Net-like architecture also ties in with @moloned's suggestion.

@hughperkins If we limit the memory usage to 8 GB, the W9100 should perform similarly to R9 390X cards, which have even higher clock speeds than a W9100 and are clocking...

@soumith Oh right. My fault on the price conversion. Corrected it above.

@hughperkins Thanks for including my efforts :) not quite there yet with performance (writing the autotuning code at this instant). I think this is a very nice idea. At least...

@NH89 Greentea/OpenCL is really slow for CNNs with batched data because of overhead and inefficiency in the matrix-matrix multiplications used for convolutions, especially when the matrices are small. This benchmark also...

@NH89 No problem. The OpenCL approaches will probably catch up with CUDA solutions during Q2/Q3 next year, as both AMD and Intel have major developments under way. For my...

@sat8 https://github.com/naibaf7/caffe has experimental int8 kernels for both CUDA and OpenCL if you're still interested in playing with this. Edit: I have to mention you'll not have the greatest time...

@orionr ROCm is not OpenCL. This will not work on any other devices than AMDGPU-PRO. It's based on HIP, an AMD drop-in replacement for CUDA.