
Check failed: status == CUBLAS_STATUS_SUCCESS (13 vs. 0) CUBLAS_STATUS_EXECUTION_FAILED

Open martinkersner opened this issue 9 years ago • 2 comments

Hi,

I am trying to retrain the ZF or VGG model using only two categories from VOC2007 (dog and cat). I have removed all other categories from the Annotations and ImageSets directories and left only the necessary images in the JPEGImages directory. I have changed the layer outputs in the relevant .prototxt files from 84 = 4*(20 categories + 1 background) to 12 = 4*(2 categories + 1 background).

The first step of training (Train RPN with conv layers tuned; compute RPN results on the train/test sets) completes without any problems. However, when the second step starts, training fails with an error in math_functions.cu: Check failed: status == CUBLAS_STATUS_SUCCESS (13 vs. 0) CUBLAS_STATUS_EXECUTION_FAILED. I have found that the failure occurs in either caffe_gpu_axpy() or caffe_gpu_scal(), and only when the parameter N has a larger value. I tried decreasing the Fast R-CNN settings in fast_rcnn_config.m, but the same error persists.

Any idea what I am doing wrong? :)

I am using Ubuntu 14.04.3 LTS, Matlab R2012a and Tesla K40m.

martinkersner avatar Oct 20 '15 09:10 martinkersner

@martinkersner Besides changing 84 = 4*(20 categories + 1 background) to 12 = 4*(2 categories + 1 background), the final layers' output dimensions should be modified from 84 -> 12 for regression and 21 -> 3 for classification. Please make sure you can pass check_gpu_memory() at line 89 of fast_rcnn_train.

If check_gpu_memory() passes without problems, your prototxt is correct.

Then please make sure the class ids in imdb/roidb are only 1 or 2 if you are using two classes.
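
A quick way to check the class ids is sketched below in Python. It assumes the ids of all ground-truth boxes have already been collected into a flat list; the helper name `find_bad_class_ids` and the sample labels are hypothetical. If the ids still follow the original alphabetical 20-class VOC ordering, cat and dog would show up as 8 and 12 instead of 1 and 2.

```python
def find_bad_class_ids(class_ids, num_classes):
    """Return the sorted ids that fall outside the valid range 1..num_classes."""
    return sorted({c for c in class_ids if not (1 <= c <= num_classes)})


# Hypothetical labels gathered from every ground-truth box in the roidb.
labels = [1, 2, 2, 8, 1, 12]
print(find_bad_class_ids(labels, num_classes=2))  # [8, 12]
```

An empty result means every label fits the 3-way classifier. Labels outside that range can make GPU code index past the end of a buffer, which is one plausible way to end up with an execution failure like the one reported here.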

ShaoqingRen avatar Oct 21 '15 04:10 ShaoqingRen

F1030 16:15:43.990840 1937 math_functions.cu:28] Check failed: status == CUBLAS_STATUS_SUCCESS (13 vs. 0) CUBLAS_STATUS_EXECUTION_FAILED

The above error occurs with CUDA 9.0. Installing Patch 2 (released Mar 5, 2018) solves it.

asa008 avatar Mar 05 '19 13:03 asa008