ndgrigorian

Results: 12 comments by ndgrigorian

I was able to replicate this issue locally. Using Sasha's code:

```
(array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1]), array([0, 0, 0, 0, 0, 0, 0,...
```

```
In [1]: import dpctl.tensor as dpt, dpctl, numpy as np
In [2]: x = dpt.arange(10)
In [3]: type(x).__dlpack__(type(x))
Segmentation fault
```

Same seg fault seems to occur with some...
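For contrast, a minimal sketch of the intended DLPack usage pattern, using NumPy as a stand-in since dpctl may not be available; the crashing snippet above calls `__dlpack__` off the class with the class itself as the argument, rather than on an instance:

```python
import numpy as np

# __dlpack__ is an instance method: call it on an array object.
x = np.arange(10)
capsule = x.__dlpack__()   # a PyCapsule wrapping the array's data

# Round-trip through the DLPack protocol; from_dlpack consumes a
# fresh capsule produced internally by x.__dlpack__().
y = np.from_dlpack(x)
print(np.array_equal(x, y))
```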

The rounding mode is not exactly at fault here. Per the array API:

> Rounds the result of dividing each element x1_i of the input array x1 by the respective element...

> This PR should address issues test failures from [gh-1378](https://github.com/IntelPython/dpctl/issues/1378)

Nit, but I think you meant gh-1375 here.

@vlad-perevezentsev These discrepancies seem to have been resolved recently.

```
In [1]: import dpctl.tensor as dpt, numpy as np
In [2]: a = dpt.asarray([0], dtype='c16', device='cpu')
In [3]: dpt.pow(a,1)
Out[3]:...
```
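A NumPy analog of the snippet above (a sketch only; dpctl itself is not assumed to be installed here): raising a complex zero to the power 1 should return a clean complex zero, with no spurious NaNs.

```python
import numpy as np

# NumPy's complex128 corresponds to the 'c16' dtype string above.
a = np.asarray([0], dtype=np.complex128)
r = np.power(a, 1)
print(r)                    # expected: a complex zero
print(np.isnan(r).any())    # no NaN components
```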

This was an intentional change. The original implementation aligned with NumPy's index wrapping, but was found to perform poorly. This style of wrapping is more closely aligned with advanced indexing,...
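The two wrapping conventions being contrasted can be sketched as follows. These helper names are hypothetical and this is not dpctl's actual implementation, just an illustration of the semantics:

```python
def wrap_modulo(i, n):
    # NumPy put/take "wrap" mode: any integer maps into range via modulo.
    return i % n

def wrap_advanced(i, n):
    # Advanced-indexing convention: negative indices count once from the
    # end; anything still out of range is an error rather than wrapping.
    if i < 0:
        i += n
    if not 0 <= i < n:
        raise IndexError(f"index {i} is out of bounds for axis of size {n}")
    return i

print(wrap_modulo(13, 10))    # 3
print(wrap_advanced(-1, 10))  # 9
```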

According to [NEP-50](https://numpy.org/neps/nep-0050-scalar-promotion.html#t4), which has been the guideline for our binary operation type promotion, this case should probably raise an exception, since it's a Python scalar that's too large for...

> According to [NEP-50](https://numpy.org/neps/nep-0050-scalar-promotion.html#t4), which has been the guideline for our binary operation type promotion, this case should probably raise an exception, since it's a Python scalar that's too large...
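A small sketch of the NEP 50 rule being referenced, using NumPy directly; the `OverflowError` branch applies under NumPy >= 2.0, while older releases upcast via legacy value-based promotion, and extending this behavior to dpctl is the point under discussion, not an established fact:

```python
import numpy as np

# A Python int one past what int32 can represent:
scalar = np.iinfo(np.int32).max + 1
print(np.iinfo(np.int32).min <= scalar <= np.iinfo(np.int32).max)  # False

# Under NEP 50 this raises rather than silently upcasting the result.
try:
    np.asarray([2], dtype=np.int32) + scalar
    print("no error (legacy promotion)")
except OverflowError:
    print("OverflowError (NEP 50)")
```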

Various changes to dpctl element-wise comparisons have been made to enable this edge case. It now works as expected.

```
In [3]: dpt.less_equal(dpt.asarray(2, dtype=np.int32), np.iinfo(np.uint32).max)
Out[3]: usm_ndarray(True)
```
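A NumPy analog of the comparison above, as a sketch (dpctl not assumed installed). `int64` is used here so the `uint32` max is exactly representable on the array side, sidestepping the mixed-signedness promotion question the dpctl change addresses:

```python
import numpy as np

# 2**32 - 1 fits in int64, so this comparison is well-defined
# under any promotion rule.
print(np.less_equal(np.int64(2), np.iinfo(np.uint32).max))  # True
```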

Testing was performed, and little to no significant performance gain was found for unary functions using `sycl::vec` overloads. TODO: benchmark with sub-group loading disabled as well.
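For reference, a minimal, hypothetical micro-benchmark pattern for a unary elementwise function, using NumPy's `sqrt` as a stand-in since dpctl may not be available; the function choice and sizes are illustrative only, not the benchmark actually used:

```python
import timeit
import numpy as np

x = np.random.default_rng(0).random(1_000_000)

# Warm up once, then take the best of several timed runs to reduce
# noise, as is conventional for micro-benchmarks.
np.sqrt(x)
best = min(timeit.repeat(lambda: np.sqrt(x), number=10, repeat=3))
print(f"best of 3 runs (10 calls each): {best:.4f} s")
```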