numba-dpex

Data Parallel Extension for Numba

Results 107 numba-dpex issues

We want something like `dpexcli compile -m -n -o ` to generate the LLVM code of a function without running it. TODO: - think about how to pass argument types

enhancement

Support for `@guvectorize` is missing. Features: - [ ] Passing intra-device arrays - [ ] Launch asynchronous - [ ] Calling Device Functions - [ ] Explicitly control the maximum...

enhancement

We currently do not have anything similar to `@cuda.reduce` and the output of this step should be a design to support a similar `@reduce` decorator for `numba-dppy`. Features to implement:...

enhancement
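For the `@reduce` design request above, here is a minimal host-side sketch of the semantics such a decorator might expose: a binary function is turned into a callable that folds a whole sequence. This is an assumption for illustration only. The decorator name `reduce`, the `init` keyword, and `sum_reduce` are hypothetical, nothing here is numba-dpex or numba-dppy API, and the code runs sequentially in plain Python rather than on a device.

```python
from functools import reduce as _fold


def reduce(binop):
    """Hypothetical decorator: wrap a binary function as a sequence reducer."""

    def reducer(values, init=None):
        items = iter(values)
        if init is None:
            # Without an initial value, seed the fold with the first element.
            try:
                acc = next(items)
            except StopIteration:
                raise ValueError("cannot reduce an empty sequence without init") from None
        else:
            acc = init
        # Sequential left fold; a device implementation would instead
        # combine partial results computed by many work items.
        return _fold(binop, items, acc)

    return reducer


@reduce
def sum_reduce(a, b):
    return a + b


print(sum_reduce([1, 2, 3, 4]))  # 10
```

A real design would also need to decide whether the binary function must be associative and commutative, since a parallel reduction does not combine elements in left-to-right order.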

Built-in types - [x] complex - [ ] bool - [ ] None - [ ] tuple

enhancement

dpex ufunc kernels cannot be called from other dpex device functions. Example:
```python
@dppy.func
def a_device_function(a):
    return a + 1

@vectorize(nopython=True)
def ufunc_kernel(x, y):
    return a_device_function(x) + y

def test_ufunc():...
```

enhancement

When `IGC_ShaderDumpEnable` is set to “1”, IGC will write a number of dumps into /tmp/IntelIGC.
```shell
$ export IGC_ShaderDumpEnable=1
```
To read the DWARF of a kernel, we first need a copy of...

debug

@akharche @reazulhoque The "--spirv-debug-info-version=ocl-100" flag in `spirv_generator.generate` is not used anywhere. Is the flag essential for us to support GDB? If not, the dead code should be removed. _Originally posted...

debug

The following does not work:
```python
import numba_dpex as dpex
from numba import float32
import dpctl
import dpnp
import numpy as np

@dpex.kernel
def kernel(array_i):
    i = dpex.get_global_id(0)
    array[i] =...
```

user

The following snippet highlights differences regarding loop unrolling between `numba` and `numba_dpex`. Regular `numba` [will try to unroll loops](https://numba.pydata.org/numba-doc/latest/user/faq.html#why-my-loop-is-not-vectorized) but the same behavior is not seen with `numba_dpex`, as it...

enhancement
user

Running into this issue when implementing l2_norm into dpebench: Here is the code:
```python
@nb.njit(parallel=False, fastmath=True)
def l2_norm(a, d):
    sq = np.square(a)
    sum = sq.sum(axis=1)
    d[:] = np.sqrt(sum)
```
Here...

enhancement
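For the `l2_norm` report above, a plain-Python reference can help when checking results from a compiled version. The sketch below computes the per-row Euclidean norm, which is what squaring, summing along `axis=1`, and taking the square root produce for a 2-D input. The helper name `l2_norm_reference` is hypothetical and it returns a new list instead of writing into an output argument `d`.

```python
import math


def l2_norm_reference(rows):
    """Return the Euclidean (L2) norm of each row of a 2-D nested list."""
    return [math.sqrt(sum(x * x for x in row)) for row in rows]


norms = l2_norm_reference([[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]])
print(norms)  # [5.0, 0.0, 10.0]
```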