Adarsh Yoga
While experimenting with a simple Intel TBB program that checks for prime numbers in parallel, I noticed a difference in the projected speedup of the program while profiling for throughout...
The execution scope of `group_barrier` needs to be determined based on the `group` argument. Currently it is hard-coded to `MemoryScope.WORK_GROUP` since numba-dpex only supports `group_barrier` on work groups. Once sub-groups...
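One way the scope selection could look is sketched below. This is only an illustration under stated assumptions: `Group`, `SubGroup`, the `SUB_GROUP` member, and `execution_scope_for` are hypothetical stand-ins defined here, not the numba-dpex classes or API.

```python
from enum import Enum


class MemoryScope(Enum):
    # Stand-in enum for illustration only; not the numba-dpex definition.
    WORK_GROUP = "work_group"
    SUB_GROUP = "sub_group"


class Group:
    """Hypothetical stand-in for a work-group handle."""


class SubGroup:
    """Hypothetical stand-in for a sub-group handle."""


# Map each supported group type to the scope its barrier should use.
_SCOPE_BY_GROUP_TYPE = {
    Group: MemoryScope.WORK_GROUP,
    SubGroup: MemoryScope.SUB_GROUP,
}


def execution_scope_for(group):
    """Return the execution scope implied by the type of `group`."""
    try:
        return _SCOPE_BY_GROUP_TYPE[type(group)]
    except KeyError:
        raise TypeError(
            f"unsupported group type: {type(group).__name__}"
        ) from None
```

With a lookup like this, the barrier would take its scope from the argument instead of a constant, and an unsupported argument would fail loudly rather than silently falling back to the work-group scope.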
The [blackscholes numpy implementation in dpbench](https://github.com/adarshyoga/dpbench/blob/main/dpbench/benchmarks/black_scholes/black_scholes_numba_dpex_n.py) is ~26X slower than the corresponding kernel and prange implementations. How to reproduce: 1) Follow the [instructions](https://github.com/adarshyoga/dpbench) to set up dpbench. 2) Run blackscholes - `python...
The [pairwise distance numba implementation](https://github.com/adarshyoga/dpbench/blob/main/dpbench/benchmarks/pairwise_distance/pairwise_distance_numba_dpex_n.py) that uses numpy calls fails because numba-dpex does not currently support `dpnp.sum` calls with a non-default axis. See the failing code snippet below. ``` @dpjit def pairwise_distance(X1, X2,...
The [numba-dpex implementation of the PCA algorithm](https://github.com/adarshyoga/dpbench/blob/main/dpbench/benchmarks/pca/pca_numba_dpex_n.py) makes several calls that are currently not supported inside a dpjit-decorated function, such as dpnp.mean(axis=0) and dpnp.linalg.eigh. These functions need to be supported to...
The knn workload in dpbench fails because it allocates memory inside prange, as shown below. To execute knn successfully, numba-dpex needs to add support for allocating memory inside prange loops. ```...