ndgrigorian

Results: 12 comments by ndgrigorian

I was able to replicate this issue locally. Using Sasha's code:

```
(array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1]), array([0, 0, 0, 0, 0, 0, 0,...
```

```
In [1]: import dpctl.tensor as dpt, dpctl, numpy as np
In [2]: x = dpt.arange(10)
In [3]: type(x).__dlpack__(type(x))
Segmentation fault
```

Same seg fault seems to occur with some...
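For contrast, a minimal sketch of the intended DLPack usage pattern, using NumPy as a stand-in since dpctl may not be available; the crashing snippet above calls `__dlpack__` off the class with the class itself as the argument, rather than on an instance:

```python
import numpy as np

# __dlpack__ is an instance method: call it on an array object.
x = np.arange(10)
capsule = x.__dlpack__()   # a PyCapsule wrapping the array's data

# Round-trip through the DLPack protocol; from_dlpack consumes a
# fresh capsule produced internally by x.__dlpack__().
y = np.from_dlpack(x)
print(np.array_equal(x, y))
```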

The rounding mode is not exactly at fault here. Per the array API:

> Rounds the result of dividing each element x1_i of the input array x1 by the respective element...

> This PR should address issues test failures from [gh-1378](https://github.com/IntelPython/dpctl/issues/1378)

Nit, but I think you meant gh-1375 here.

@vlad-perevezentsev These discrepancies seem to have been resolved recently.

```
In [1]: import dpctl.tensor as dpt, numpy as np
In [2]: a = dpt.asarray([0], dtype='c16', device='cpu')
In [3]: dpt.pow(a,1)
Out[3]:...
```
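A NumPy analog of the snippet above (a sketch only; dpctl itself is not assumed to be installed here): raising a complex zero to the power 1 should return a clean complex zero, with no spurious NaNs.

```python
import numpy as np

# NumPy's complex128 corresponds to the 'c16' dtype string above.
a = np.asarray([0], dtype=np.complex128)
r = np.power(a, 1)
print(r)                    # expected: a complex zero
print(np.isnan(r).any())    # no NaN components
```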

This was an intentional change. The original implementation aligned with NumPy's index wrapping, but was found to perform poorly. This style of wrapping is more closely aligned with advanced indexing,...
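The two wrapping conventions being contrasted can be sketched as follows. These helper names are hypothetical and this is not dpctl's actual implementation, just an illustration of the semantics:

```python
def wrap_modulo(i, n):
    # NumPy put/take "wrap" mode: any integer maps into range via modulo.
    return i % n

def wrap_advanced(i, n):
    # Advanced-indexing convention: negative indices count once from the
    # end; anything still out of range is an error rather than wrapping.
    if i < 0:
        i += n
    if not 0 <= i < n:
        raise IndexError(f"index {i} is out of bounds for axis of size {n}")
    return i

print(wrap_modulo(13, 10))    # 3
print(wrap_advanced(-1, 10))  # 9
```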

According to [NEP-50](https://numpy.org/neps/nep-0050-scalar-promotion.html#t4), which has been the guideline for our binary operation type promotion, this case should probably raise an exception, since it's a Python scalar that's too large for...

> According to [NEP-50](https://numpy.org/neps/nep-0050-scalar-promotion.html#t4), which has been the guideline for our binary operation type promotion, this case should probably raise an exception, since it's a Python scalar that's too large...
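A small sketch of the NEP 50 rule being referenced, using NumPy directly; the `OverflowError` branch applies under NumPy >= 2.0, while older releases upcast via legacy value-based promotion, and extending this behavior to dpctl is the point under discussion, not an established fact:

```python
import numpy as np

# A Python int one past what int32 can represent:
scalar = np.iinfo(np.int32).max + 1
print(np.iinfo(np.int32).min <= scalar <= np.iinfo(np.int32).max)  # False

# Under NEP 50 this raises rather than silently upcasting the result.
try:
    np.asarray([2], dtype=np.int32) + scalar
    print("no error (legacy promotion)")
except OverflowError:
    print("OverflowError (NEP 50)")
```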

Various changes to dpctl element-wise comparisons have been made to enable this edge case. It now works as expected.

```
In [3]: dpt.less_equal(dpt.asarray(2, dtype=np.int32), np.iinfo(np.uint32).max)
Out[3]: usm_ndarray(True)
```
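A NumPy analog of the comparison above, as a sketch (dpctl not assumed installed). `int64` is used here so the `uint32` max is exactly representable on the array side, sidestepping the mixed-signedness promotion question the dpctl change addresses:

```python
import numpy as np

# 2**32 - 1 fits in int64, so this comparison is well-defined
# under any promotion rule.
print(np.less_equal(np.int64(2), np.iinfo(np.uint32).max))  # True
```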

Testing was performed, and little to no significant performance gain was found for unary functions using `sycl::vec` overloads. TODO: benchmark with sub-group loading disabled as well.
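For reference, a minimal, hypothetical micro-benchmark pattern for a unary elementwise function, using NumPy's `sqrt` as a stand-in since dpctl may not be available; the function choice and sizes are illustrative only, not the benchmark actually used:

```python
import timeit
import numpy as np

x = np.random.default_rng(0).random(1_000_000)

# Warm up once, then take the best of several timed runs to reduce
# noise, as is conventional for micro-benchmarks.
np.sqrt(x)
best = min(timeit.repeat(lambda: np.sqrt(x), number=10, repeat=3))
print(f"best of 3 runs (10 calls each): {best:.4f} s")
```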