"cudamat.cudamat.CUDAMatException: CUBLAS error." occurs when running the multimodal_dbm example
Hi Nitish Srivastava,
@nitishsrivastava
I have a problem when running the multimodal_dbm example:
Train Step: 0
Traceback (most recent call last):
File "/home/meitu299/deepnet/deepnet/trainer.py", line 60, in
My computer has 8G of RAM and 3G of GPU memory, with CUDA 6.0. I followed your INSTALL instructions, but this always happens. Could you tell me how to resolve it? Is it a bug? Thanks.
Try reducing the batch size from 128 to 100.
I have tried reducing the batch size to 50, but it doesn't work.
Try setting the "gpu_memory" value in your .pbtxt file to "2G" or "2.5G".
Thanks, that works.
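If it helps, here is a rough sketch of making that change with a script. It assumes the file contains a line of the form gpu_memory: "3G"; the exact layout of your .pbtxt may differ. The function name set_gpu_memory and the sample text are made up for illustration and are not part of deepnet.

```python
import re


def set_gpu_memory(pbtxt_text, new_value):
    """Return pbtxt_text with every quoted gpu_memory value replaced.

    Assumes entries look like: gpu_memory: "3G" (an assumption about the
    file layout, not something taken from the deepnet sources).
    """
    pattern = re.compile(r'(\bgpu_memory\s*:\s*)"[^"]*"')
    return pattern.sub(lambda m: m.group(1) + '"' + new_value + '"', pbtxt_text)


# Hypothetical file contents, used only to show the substitution.
sample = 'name: "example"\ngpu_memory: "3G"\nmain_memory: "8G"\n'
updated = set_gpu_memory(sample, "2G")
print(updated)
```

Only the gpu_memory entry is touched; other fields such as main_memory are left as they were.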
Thanks in advance. I have a similar problem. When I run the ff example, I changed the steps from 1000000 to 10000, the batch size from 100 to 10, gpu_memory from 2G to 0.1G, and main_memory from 4G to 0.7G. But when it reaches step 499, it still fails like this:
  File "/home/tbq/Downloads/deepnet-master/deepnet/softmax_layer.py", line 65, in GetLoss
    perf.correct_preds = temp.sum()
  File "/home/tbq/Downloads/deepnet-master/cudamat/cudamat.py", line 720, in sum
    return vdot(self,CUDAMatrix.ones.slice(0,self.shape[0]*self.shape[1]))
  File "/home/tbq/Downloads/deepnet-master/cudamat/cudamat.py", line 1650, in vdot
    raise generate_exception(err_code.value)
cudamat.cudamat.CUDAMatException: CUBLAS error.
My computer has 1G of RAM and 256M of GPU memory, with CUDA 5.5.
When I try the dbm and rbm examples, the same problem occurs. I would like to know whether my CPU and GPU fail to meet the requirements. Thanks.
Sorry, English is not my mother tongue. In addition, gcc: 4.6.3
In my case, I decreased gpu_mem to 1G in run_all_dbn.sh, even though my GPU has 4G of memory (NVIDIA GeForce GTX 780M 4096 MB).
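For anyone who wants to sanity-check a setting like this, one option is to compare the configured budget against the card's physical memory before running. This is only a minimal sketch: parse_size and fits are hypothetical helpers, not part of deepnet or cudamat, and the 0.8 headroom factor is an arbitrary guess on my part.

```python
def parse_size(text):
    """Convert a size string such as "2G", "0.1G" or "256M" to bytes."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip().upper()
    if text and text[-1] in units:
        return int(float(text[:-1]) * units[text[-1]])
    return int(float(text))  # plain number, taken as bytes


def fits(budget, physical, headroom=0.8):
    """True if the budget stays under headroom * physical memory.

    The headroom factor is an assumed safety margin, not a documented value.
    """
    return parse_size(budget) <= headroom * parse_size(physical)


print(fits("1G", "4G"))  # True: 1G is well under 80% of 4G
print(fits("3G", "3G"))  # False: leaves no room for anything else
```

Passing this check does not guarantee that training will run, since other processes and the display may also be using GPU memory; it only flags budgets that are clearly too close to the physical limit.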
Thank you all. Reducing gpu_mem really helps, and the code starts to work, but at the end of training the first layer the error happens again. Is gpu_mem still too large? And what will happen if I reduce gpu_mem further?
Thanks a lot to anyone who can help.