Diptorup Deb comments

Results: 54 comments of Diptorup Deb

Lower than expected performance in blackscholes numpy implementation

The slowdown may be related to kernel launch overhead in the `JitKernel` custom dispatcher class. The overhead is especially noticeable with small problem sizes. The `experimental.dispatcher.KernelDispatcher` fixes the launch overhead. Can you...
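To illustrate why a fixed launch cost matters most at small problem sizes, here is a minimal sketch of a simple cost model. The overhead and per-element figures are hypothetical placeholders chosen for illustration, not measurements of `JitKernel` or `KernelDispatcher`.

```python
def overhead_fraction(n_items, launch_overhead_s, per_item_s):
    """Return the share of total run time spent on a fixed launch overhead."""
    total = launch_overhead_s + n_items * per_item_s
    return launch_overhead_s / total


# Hypothetical figures, assumed only for this illustration.
LAUNCH_OVERHEAD_S = 1e-4  # fixed cost paid once per kernel launch
PER_ITEM_S = 1e-9         # compute cost per work item

for n in (2**10, 2**20, 2**30):
    frac = overhead_fraction(n, LAUNCH_OVERHEAD_S, PER_ITEM_S)
    print(f"n={n:>12d}  overhead share={frac:.1%}")
```

Under this model the fixed cost dominates for small inputs and becomes negligible as the input grows, which matches the observation that the slowdown shows up mainly at small problem sizes.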

Support for RNG in `numba_dpex`

@fcharras sorry for getting to this issue so late. Would you like to open a PR contributing your RNG implementation to numba-dpex? We can review and merge it.

Support for RNG in `numba_dpex`

> I will be busy early this week, I'll start working on it mid-week if that's fine for you.

I am out this week. If you want to start next...

numpy sum operator (axis) isn't supported

There are two issues here:

1. Parfor does not support `sum` with the `axis` keyword.
2. GPU kernel generation for reductions is not yet supported.

@DrTodd13 can you take a...
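For readers unfamiliar with the `axis` keyword, the following is a minimal pure-Python sketch of what a sum along an axis computes for a 2-D input. The helper name `sum_along_axis` is hypothetical and is not part of numba-dpex or NumPy. It only illustrates the semantics of the reduction and is not a proposed workaround.

```python
def sum_along_axis(rows, axis):
    """Sum a rectangular list of lists along axis 0 (down columns) or axis 1 (across rows)."""
    if axis == 0:
        # Collapse the row dimension: one total per column.
        return [sum(col) for col in zip(*rows)]
    if axis == 1:
        # Collapse the column dimension: one total per row.
        return [sum(row) for row in rows]
    raise ValueError("axis must be 0 or 1 for a 2-D input")


data = [[1, 2, 3], [4, 5, 6]]
col_sums = sum_along_axis(data, 0)
row_sums = sum_along_axis(data, 1)
print(col_sums, row_sums)
```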

wrong output dtype for math functions

@fcharras The issue here is that the `math.ceil` and `math.floor` functions are replaced by the SYCL equivalents that only support floating-point values. We are looking at a solution where...
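The dtype mismatch can be pictured with a small standard-library sketch. Python's `math.ceil` returns an `int`, while a float-only replacement returns a `float`. The function `float_only_ceil` below is a hypothetical stand-in for such a replacement, not the actual SYCL function or numba-dpex code.

```python
import math


def float_only_ceil(x):
    """Hypothetical stand-in for a ceil that accepts and returns only floats."""
    return float(math.ceil(float(x)))


py_result = math.ceil(2.3)            # Python's math.ceil returns an int
float_result = float_only_ceil(2.3)   # the stand-in returns a float

print(type(py_result).__name__, py_result)
print(type(float_result).__name__, float_result)
```

The two values compare equal, but their types differ, which is the kind of discrepancy the issue title describes.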

sporadic inaccurate results relative to numpy if atomic add is used

Updated the reproducer to the latest API, and I can reproduce the freeze/deadlock reported previously:

```python
import argparse
import math
import dpctl
import dpnp
import numpy as np
import numpy.random as...
```

sporadic inaccurate results relative to numpy if atomic add is used

> Updated the reproducer to latest API and I can reproduce the freeze/deadlock reported previously: > I experience the issue on a Gen9 integrated graphics only at problem size `2**18`...

Numba-dpex examples suggestion

@mingjie-intel have a look. These suggested use cases can serve as good motivating examples for your reduction kernel work.

Numba-dpex examples suggestion

@roxx30198 The examples suggested by @oleksandr-pavlyk are a good starting point for you to familiarize yourself with numba-dpex and parallel programming in general.

Function from ramba fails to compile when dpjit decorated.

@DrTodd13 fixed! The formatting, that is :wink:. I will take a look and provide an update.