Dtype issues with gpu backend
Hello, I was experimenting with Neon and ran into an issue with the convolutional and pooling layers. The task was image classification, so the input data shape was (3, H, W). If an ArrayIterator or HDF5Iterator is used as the dataset, the input shape values may carry numpy datatypes such as numpy.int64 (for ArrayIterator they come from the lshape parameter; for HDF5Iterator they are read from file['input'].attrs['lshape']). When these values are passed to the model's configure method as in_obj, they are assigned to layer.in_shape, which is then used to initialize the layer parameters. During the forward pass, the following errors arise (a minimal reproduction sketch follows the parameter dump below):
- conv layer:
File "<user>/neon/backends/nervanagpu.py", line 1990, in fprop_conv
return self._execute_conv("fprop", layer, layer.fprop_kernels, repeat)
File "<user>/neon/backends/nervanagpu.py", line 2072, in _execute_conv
kernels.execute(repeat)
File "<user>/neon/backends/convolution.py", line 224, in execute
kernel.prepared_async_call(*self.launch_args, shared_size=self.shared)
File "<user>/pycuda-2017.1.1-py3.5-linux-x86_64.egg/pycuda/driver.py", line 516, in function_prepared_async_call
func._launch_kernel(grid, block, arg_buf, shared_size, stream)
TypeError: No registered converter was able to produce a C++ rvalue of type unsigned int from this Python object of type numpy.int64
- pool layer:
File "<user>/neon/backends/nervanagpu.py", line 2316, in fprop_pool
layer.fprop_lut_size, repeat)
File "<user>/neon/backends/nervanagpu.py", line 2349, in _execute_pool
kernel.prepared_async_call(*params, shared_size=shared)
File "<user>/pycuda-2017.1.1-py3.5-linux-x86_64.egg/pycuda/driver.py", line 516, in function_prepared_async_call
func._launch_kernel(grid, block, arg_buf, shared_size, stream)
TypeError: No registered converter was able to produce a C++ rvalue of type unsigned int from this Python object of type numpy.int64
- memory allocation in conv:
File "<user>/neon/backends/convolution.py", line 1307, in bind_params
input_data = self.lib.scratch_buffer_offset(self.size)
File "<user>/neon/backends/nervanagpu.py", line 875, in scratch_buffer_offset
data = int(_get_scratch_data(self.scratch_size)) + self.scratch_offset
File "<decorator-gen-62>", line 2, in _get_scratch_data
File "<user>/pycuda-2017.1.1-py3.5-linux-x86_64.egg/pycuda/tools.py", line 430, in context_dependent_memoize
result = func(*args)
File "<user>/neon/backends/nervanagpu.py", line 3287, in _get_scratch_data
return drv.mem_alloc(scratch_size)
Boost.Python.ArgumentError: Python argument types in
pycuda._driver.mem_alloc(numpy.int64)
did not match C++ signature:
mem_alloc(unsigned long)
Layer parameters:
In "<>/neon/backends/convolution.py", line 75, in __init__:
(N, C, K, D, H, W, T, R, S, M, P, Q, pad_d, pad_h, pad_w, str_d, str_h, str_w, dil_d, dil_h, dil_w)
Have following values (idx, type, value):
[(0, <class 'int'>, 128), (1, <class 'numpy.int64'>, 3), (2, <class 'int'>, 32), (3, <class 'int'>, 1), (4, <class 'numpy.int64'>, 128), (5, <class 'numpy.int64'>, 128), (6, <class 'int'>, 1), (7, <class 'int'>, 3), (8, <class 'int'>, 3), (9, <class 'int'>, 1), (10, <class 'numpy.int64'>, 128), (11, <class 'numpy.int64'>, 128), (12, <class 'int'>, 0), (13, <class 'int'>, 2), (14, <class 'int'>, 2), (15, <class 'int'>, 1), (16, <class 'int'>, 1), (17, <class 'int'>, 1), (18, <class 'int'>, 1), (19, <class 'int'>, 2), (20, <class 'int'>, 2)]
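For reference, here is a minimal sketch of the kind of setup that triggers this; the shapes, class count, and random data are illustrative rather than taken from my actual experiment:

```python
import numpy as np
from neon.backends import gen_backend
from neon.data import ArrayIterator

be = gen_backend(backend='gpu', batch_size=128)

# Illustrative random data in (N, C*H*W) layout.
X = np.random.rand(1024, 3 * 128 * 128).astype(np.float32)
y = np.random.randint(0, 10, size=(1024, 1))

# Building lshape from a numpy array (as happens when it is read from HDF5
# attrs) yields numpy.int64 elements rather than Python ints; these values
# later reach the pycuda kernel launch arguments unchanged.
lshape = tuple(np.array([3, 128, 128]))
train = ArrayIterator(X, y, nclass=10, lshape=lshape)
```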
Casting all parameters to int in the layer initialization fixes the issue for me, but that does not seem like a proper solution. Casting the elements of lshape to int also helps (see the sketch below). I think it would be great if the input values were checked, or converted to the expected types, on the library side. The other layer types (linear, batchnorm, recurrent, etc.) and backends (cpu, mkl) that I used did not show this issue.
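For anyone hitting the same errors, a sketch of that user-side workaround (the lshape values are illustrative):

```python
# Cast every lshape element back to a built-in int before the iterator sees it;
# the GPU kernel launch arguments then receive plain Python ints as expected.
lshape = tuple(int(v) for v in np.array([3, 128, 128]))
train = ArrayIterator(X, y, nclass=10, lshape=lshape)
```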
Environment: Python 3.5.2, neon 2.6.0 (f9d771bbb5f5fa3ae129748596d0ced5389c7f88), CUDA 8.0, GPU K40s, Ubuntu 16.04, Boost 1.58.0, PyCUDA 2017.1.1, NumPy 1.13.1.
@zhiltsov-max Agreed. A type check is needed here.
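One possible form such a check could take on the library side (a sketch only, not existing neon code; the helper name is hypothetical):

```python
def _as_builtin_int_shape(shape):
    """Coerce every dimension of a shape to a built-in int so that downstream
    pycuda calls (prepared kernel launches, mem_alloc) receive C-compatible
    scalars instead of numpy integer scalars."""
    return tuple(int(d) for d in shape)
```

Applying something like this to in_shape (and to derived sizes such as the scratch buffer size) at configure time should cover the three tracebacks above, since all of them come down to numpy.int64 values reaching pycuda.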