PyCudaHandler: compilation fails - identifier "int32_t" is undefined
Environment: Windows 10 64-bit, PyCUDA 2015.1.3, CUDA 7.5.
When PyCudaHandler is used while running the CIFAR-10 example (examples/cifar10_cnn.py, line 64), the following error occurs:
- - - - - - - - - - Before Training - - - - - - - - - -
Traceback (most recent call last):
File "C:\Anaconda3\lib\site-packages\pycuda\tools.py", line 426, in context_dependent_memoize
return ctx_dict[cur_ctx][args]
KeyError: <pycuda._driver.Context object at 0x000000001BE82208>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\training\trainer.py", line 162, in _call_hook
stepper=self.stepper, logs=self.logs), False
File "C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\hooks.py", line 422, in call
return evaluate(net, self.iter, self.scorers)
File "C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\tools.py", line 77, in evaluate
network.forward_pass()
File "C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\structure\network.py", line 431, in forward_pass
layer.forward_pass(self.buffer[layer_name], training_pass)
File "C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\layers\convolution_layer_2d.py", line 89, in forward_pass
self.padding, self.stride)
File "C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\handlers\pycuda_handler.py", line 256, in conv2d_forward_batch
self.add_mv(flat_outputs, bias, flat_outputs)
File "C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\handlers\pycuda_handler.py", line 124, in add_mv
cumisc.add_matvec(m, v, out=out)
File "C:\Anaconda3\lib\site-packages\skcuda\misc.py", line 1114, in add_matvec
return binaryop_matvec('+', x_gpu, a_gpu, axis, out, stream)
File "C:\Anaconda3\lib\site-packages\skcuda\misc.py", line 921, in binaryop_matvec
row_kernel, col_kernel = _get_binaryop_vecmat_kernel(x_gpu.dtype, binary_op)
File "
kernel.cu(6): error: identifier "int32_t" is undefined
2 errors detected in the compilation of "C:/Users/Brian/AppData/Local/Temp/tmpxft_000010a8_00000000-8_kernel.cpp1.ii". ]
An error occurred while calling the "validation" hook:
KeyError Traceback (most recent call last) C:\Anaconda3\lib\site-packages\pycuda\tools.py in context_dependent_memoize(func, *args) 425 try: --> 426 return ctx_dict[cur_ctx][args] 427 except KeyError:
KeyError: <pycuda._driver.Context object at 0x000000001BE82208>
During handling of the above exception, another exception occurred:
CompileError Traceback (most recent call last)
C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\training\trainer.py in train(self, net, training_data_iter, **named_data_iters) 81 named_data_iters['training_data_iter'] = training_data_iter 82 self._start_hooks(net, named_data_iters) ---> 83 if self._emit_hooks(net, 'update') or self._emit_hooks(net, 'epoch'): 84 return 85
C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\training\trainer.py in _emit_hooks(self, net, timescale, logs) 148 continue 149 --> 150 hook_log, stop = self._call_hook(hook, net) 151 should_stop |= stop 152 self._add_log(name, hook_log, hook.verbose, logs=logs)
C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\training\trainer.py in _call_hook(self, hook, net) 167 .format(hook.name), file=sys.stderr) 168 print(traceback.format_exc()) --> 169 raise e 170 171 def _add_log(self, name, val, verbose=None, logs=None, indent=0):
C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\training\trainer.py in _call_hook(self, hook, net) 160 update_nr=self.current_update_nr, 161 net=net, --> 162 stepper=self.stepper, logs=self.logs), False 163 except StopIteration as err: 164 return getattr(err, 'value', None), True
C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\hooks.py in call(self, epoch_nr, update_nr, net, stepper, logs) 420 421 def call(self, epoch_nr, update_nr, net, stepper, logs): --> 422 return evaluate(net, self.iter, self.scorers) 423 424
C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\tools.py in evaluate(network, iter, scorers, out_name, targets_name, mask_name) 75 76 for _ in run_network(network, iterator): ---> 77 network.forward_pass() 78 gather_losses_and_scores( 79 network, scorers, scores, out_name=out_name,
C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\structure\network.py in forward_pass(self, training_pass, context) 429 self._buffer_manager.apply_context(context) 430 for layer_name, layer in list(self.layers.items())[1:]: --> 431 layer.forward_pass(self.buffer[layer_name], training_pass) 432 433 def backward_pass(self):
C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\layers\convolution_layer_2d.py in forward_pass(self, buffers, training_pass) 87 # calculate outputs 88 _h.conv2d_forward_batch(flat_inputs, W, bias, flat_outputs, ---> 89 self.padding, self.stride) 90 _h.inplace_act_func[self.activation](outputs) 91
C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\handlers\pycuda_handler.py in conv2d_forward_batch(self, inputs, params, bias, outputs, padding, stride) 254 255 flat_outputs = flatten_all_but_last(outputs) --> 256 self.add_mv(flat_outputs, bias, flat_outputs) 257 258 def dot_add_mm(self, a, b, out, transa=False, transb=False):
C:\Anaconda3\lib\site-packages\brainstorm-0.5b1-py3.4-win-amd64.egg\brainstorm\handlers\pycuda_handler.py in add_mv(self, m, v, out) 122 123 def add_mv(self, m, v, out): --> 124 cumisc.add_matvec(m, v, out=out) 125 126 def add_st(self, s, t, out):
C:\Anaconda3\lib\site-packages\skcuda\misc.py in add_matvec(x_gpu, a_gpu, axis, out, stream) 1112 """ 1113 -> 1114 return binaryop_matvec('+', x_gpu, a_gpu, axis, out, stream) 1115 1116
C:\Anaconda3\lib\site-packages\skcuda\misc.py in binaryop_matvec(binary_op, x_gpu, a_gpu, axis, out, stream) 919 raise ValueError('invalid operator') 920 --> 921 row_kernel, col_kernel = _get_binaryop_vecmat_kernel(x_gpu.dtype, binary_op) 922 n, m = np.int32(x_gpu.shape[0]), np.int32(x_gpu.shape[1]) 923
C:\Anaconda3\lib\site-packages\skcuda\misc.py in _get_binaryop_vecmat_kernel(dtype, binary_op)
C:\Anaconda3\lib\site-packages\pycuda\tools.py in context_dependent_memoize(func, *args) 428 context_dependent_memoized_functions.append(func) 429 arg_dict = ctx_dict.setdefault(cur_ctx, {}) --> 430 result = func(*args) 431 arg_dict[args] = result 432 return result
C:\Anaconda3\lib\site-packages\skcuda\misc.py in _get_binaryop_vecmat_kernel(dtype, binary_op) 858 ctype = dtype_to_ctype(dtype) 859 tmpl = template.substitute(type=ctype, binary_op=binary_op) --> 860 mod = SourceModule(tmpl) 861 862 add_row_vec_kernel = mod.get_function('opRowVecToMat')
C:\Anaconda3\lib\site-packages\pycuda\compiler.py in __init__(self, source, nvcc, options, keep, no_extern_c, arch, code, cache_dir, include_dirs) 257 258 cubin = compile(source, nvcc, options, keep, no_extern_c, --> 259 arch, code, cache_dir, include_dirs) 260 261 from pycuda.driver import module_from_buffer
C:\Anaconda3\lib\site-packages\pycuda\compiler.py in compile(source, nvcc, options, keep, no_extern_c, arch, code, cache_dir, include_dirs, target) 247 options.append("-I"+i) 248 --> 249 return compile_plain(source, options, keep, nvcc, cache_dir, target) 250 251
C:\Anaconda3\lib\site-packages\pycuda\compiler.py in compile_plain(source, options, keep, nvcc, cache_dir, target) 135 raise CompileError("nvcc compilation of %s failed" % cu_file_path, 136 cmdline, stdout=stdout.decode("utf-8", "replace"), --> 137 stderr=stderr.decode("utf-8", "replace")) 138 139 if stdout or stderr:
CompileError: nvcc compilation of C:\Users\Brian\AppData\Local\Temp\tmpshxtpail\kernel.cu failed [command: nvcc --cubin -arch sm_50 -m64 -Ic:\anaconda3\lib\site-packages\pycuda\cuda kernel.cu] [stdout: kernel.cu ] [stderr: kernel.cu(6): error: identifier "int32_t" is undefined
kernel.cu(6): error: identifier "int32_t" is undefined
2 errors detected in the compilation of "C:/Users/Brian/AppData/Local/Temp/tmpxft_000010a8_00000000-8_kernel.cpp1.ii". ]
In pycuda_handler, could we check whether int32_t is defined and, if it is not, add this line to the affected kernel(s)?
typedef int int32_t;
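One way to read this suggestion, sketched in Python: the C preprocessor cannot test whether a typedef exists, so the check would have to be done on the kernel source text before it reaches nvcc. The helper below is hypothetical; `ensure_int32_t` and the guard macro `BRAINSTORM_INT32_T_DEFINED` are names invented for this sketch and are not part of brainstorm, PyCUDA, or scikit-cuda. It also assumes `int` is 32 bits wide. Note that the failing kernel is compiled inside scikit-cuda, so brainstorm would not see this source string without patching that library.

```python
import re

# Hypothetical preamble; the guard macro name is invented for this sketch.
INT32_PREAMBLE = (
    "#ifndef BRAINSTORM_INT32_T_DEFINED\n"
    "#define BRAINSTORM_INT32_T_DEFINED\n"
    "typedef int int32_t;  /* assumes int is 32 bits wide */\n"
    "#endif\n"
)

_USES_INT32 = re.compile(r"\bint32_t\b")
_INCLUDES_STDINT = re.compile(r'#\s*include\s*[<"](?:stdint\.h|cstdint)[>"]')
_TYPEDEFS_INT32 = re.compile(r"\btypedef\b[^;]*\bint32_t\s*;")


def ensure_int32_t(source):
    """Return kernel source with an int32_t typedef prepended when needed."""
    # Nothing to do if the kernel never mentions int32_t.
    if not _USES_INT32.search(source):
        return source
    # Leave the source alone if it already declares the type itself.
    if _INCLUDES_STDINT.search(source) or _TYPEDEFS_INT32.search(source):
        return source
    return INT32_PREAMBLE + source


# Made-up kernel used only to exercise the helper.
kernel = "__global__ void fill(int *out, const int32_t n) { /* ... */ }\n"
patched = ensure_int32_t(kernel)
```

The patched string would then be what gets handed to `SourceModule`. A blanket typedef like this can clash if a header that defines `int32_t` is pulled in later in the same translation unit, which is one reason fixing the kernel upstream may be the cleaner route.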
Figured out where the int32_t's are originating: the misc module in scikit-cuda, lines 822-3. As a quick hack I changed those to int, and now this works. Not sure if anything can be done within brainstorm to account for this; probably not. Needless to say, I don't like having to hack another module to get this working. Windows users (whom this problem will most likely affect) might have to make some changes to their CUDA setup so that int32_t is defined?
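For illustration, the quick hack amounts to a textual substitution on the kernel template. This is a minimal sketch using a made-up template, not the actual scikit-cuda source: the kernel signature and the helper name `use_plain_int` are assumptions. The `substitute(type=..., binary_op=...)` call mirrors the one visible in the traceback.

```python
import re
from string import Template


def use_plain_int(kernel_source):
    """Replace whole-word int32_t with int; uint32_t and similar are untouched."""
    return re.sub(r"\bint32_t\b", "int", kernel_source)


# Made-up stand-in for a kernel template; NOT the scikit-cuda source.
template = Template(
    "__global__ void opRowVecToMat(const ${type} *mat, const ${type} *vec,\n"
    "                              ${type} *out, const int32_t n,\n"
    "                              const int32_t m) {\n"
    "    /* ... out[i] = mat[i] ${binary_op} vec[...] ... */\n"
    "}\n"
)

# Fill in the placeholders, then swap the fixed-width type for plain int.
source = use_plain_int(template.substitute(type="float", binary_op="+"))
```

Because the substitution is anchored on word boundaries, it only touches the exact identifier `int32_t`, which matches the manual edit of changing those two lines to `int`.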
scikit-cuda.misc appears to use int in other kernels, but int32_t in
this kernel. Perhaps this can be changed in scikit-cuda so that future
versions will not have this problem. I'm sure they would like improved
Windows compatibility. Perhaps you would like to submit a PR to them on
GitHub?
On 21 November 2015 at 23:02, Brian Thomas wrote:
Figured out where the int32_t's are originating: the misc module in scikit-cuda, lines 822-3. As a quick hack I changed those to int, and now this works. Not sure if anything can be done within brainstorm to account for this; probably not. Needless to say, I don't like having to hack another module to get this working. Windows users (whom this problem will most likely affect) might have to make some changes to their CUDA setup so that int32_t is defined?
I will do so. Thanks