Franck Charras
Reporting some progress on this issue. I've run a more exhaustive grid search of all possible combinations of performance parameters, and it improved performance to up to 70% of dpnp's, which...
With the full grid search we achieve 90% of dpnp performance, which is way above initial expectations!
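For context, the exhaustive search described above can be sketched roughly as follows. This is a hedged illustration only: the parameter names (`work_group_size`, `sub_group_size`, `items_per_thread`) and the timing target are hypothetical, not the actual sklearn-numba-dpex tuning code.

```python
import itertools
import timeit

# Hypothetical search space of kernel performance parameters
# (names are illustrative, not the actual sklearn-numba-dpex parameters).
search_space = {
    "work_group_size": [64, 128, 256],
    "sub_group_size": [8, 16, 32],
    "items_per_thread": [1, 2, 4],
}

def benchmark(params):
    # Stand-in for timing one kernel configuration; a real autotuner
    # would launch the kernel and synchronize the device around the timer.
    return timeit.timeit(lambda: sum(range(1000)), number=10)

def grid_search(space, benchmark_fn):
    """Exhaustively time every combination and return the fastest one."""
    best_params, best_time = None, float("inf")
    for values in itertools.product(*space.values()):
        params = dict(zip(space.keys(), values))
        elapsed = benchmark_fn(params)
        if elapsed < best_time:
            best_params, best_time = params, elapsed
    return best_params, best_time

best, t = grid_search(search_space, benchmark)
```

Exhaustive search is affordable here because the space is small (27 combinations above); a real run would cache the winning combination per device.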
Sorry, this was a false alarm: the autotuner was indeed able to find the fastest combination of parameters, but we had a bug for some of them that caused the...
The implementation I benchmarked in https://github.com/soda-inria/sklearn-numba-dpex/pull/102 explicitly loads all the sliding windows into shared memory, but I believe I've seen implementations that instead rely on the cache to implicitly enable fast...
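As a rough illustration of the explicit pattern (a plain NumPy/CPU sketch, not the actual kernel from #102): each simulated work group first copies the input slice it needs, plus a halo, into a local buffer, and every work item then reads its window from that buffer rather than from global memory.

```python
import numpy as np

def sliding_window_sums_explicit(x, window, group_size):
    """CPU sketch of the 'explicit shared memory' pattern: each simulated
    work group copies its input slice (plus a halo of window - 1 elements)
    into a local buffer, then computes window sums from that buffer."""
    n_out = x.size - window + 1
    out = np.empty(n_out, dtype=x.dtype)
    for group_start in range(0, n_out, group_size):
        group_stop = min(group_start + group_size, n_out)
        # One copy from "global" to "local" memory per work group.
        local = x[group_start:group_stop + window - 1].copy()
        for i in range(group_start, group_stop):
            j = i - group_start
            out[i] = local[j:j + window].sum()
    return out

x = np.arange(10, dtype=np.float64)
result = sliding_window_sums_explicit(x, window=3, group_size=4)
```

The cache-based alternative would simply index the global array directly and count on overlapping reads hitting L1/L2.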
The `numba.cuda.random` RNG does not come from low-level functions but [is implemented in numba](https://github.com/numba/numba/blob/main/numba/cuda/random.py), so in fact the current state of `numba.cuda.random` is easy to port or to mimic. E.g...
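For reference, the core step is tiny, which is why porting is easy. Here is a minimal pure-Python sketch of one xoroshiro128+ step (the generator family used by `numba.cuda.random`); the rotation constants below are from one published variant and may not match numba's source exactly.

```python
MASK64 = (1 << 64) - 1

def rotl(x, k):
    """64-bit left rotation."""
    return ((x << k) | (x >> (64 - k))) & MASK64

def xoroshiro128p_next(state):
    """One xoroshiro128+ step: returns a 64-bit value and advances state.
    state is a list [s0, s1] of two 64-bit integers (not both zero)."""
    s0, s1 = state
    result = (s0 + s1) & MASK64
    s1 ^= s0
    state[0] = rotl(s0, 55) ^ s1 ^ ((s1 << 14) & MASK64)
    state[1] = rotl(s1, 36)
    return result
```

Since the whole state is two 64-bit integers per thread, a per-work-item state array maps naturally onto a GPU kernel.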
We just merged it; see https://github.com/soda-inria/sklearn-numba-dpex/commit/6190f8f2ffc9a3872ac07a58137a7c59131966a8 for the module and tests. It's true that the JAX RNG interface is nicer, but the xoroshiro128 PR was on its way to being merged before we...
Hello @diptorupd, sure, I can do that, TY for the invitation! I will be busy early this week; I'll start working on it mid-week if that's fine for you.
I think it's fixed and the issue can be closed @ogrisel: https://github.com/IntelPython/numba-dpex/blob/main/numba_dpex/config.py#L36-L45
(Sorry for the lack of feedback this week, I took some time off.) Practically speaking, I wouldn't say this issue is too bothersome; it's more a matter of clarity. Python...
For me, #960 indeed fixes the issue. I just want to add that I realized a mistake I made in the OP: when saying > it returns a `float64` I didn't...