Implement a kernel target for devices that do not support float64 dtype

Open chudur-budur opened this issue 1 year ago • 0 comments

Currently, numba-dpex uses the same target context to generate kernels for all types of SPIR-V-capable devices. One limitation of this design is that there is no easy way to restrict the supported signatures for math functions (and probably built-in operators) on devices that do not support certain floating-point widths. This can lead to kernel compilation errors such as the one described in #960:


math.ceil and math.floor do not work on devices that do not support float64 compute (such as iGPUs), and this PR does not solve that point.

Even when casting everything to float32 beforehand, e.g.:

import dpctl.tensor as dpt
import dpnp
import numba_dpex as dpx
import math

N = 10

cst = dpnp.float32(0.5)

@dpx.kernel
def func(a):
    i = dpx.get_global_id(0)
    i = math.ceil(dpnp.float32(i) + cst)
    a[i] = a[i] + i

a = dpt.ones(N, dtype=dpnp.float32)

func[dpx.Range(N)](a)

print(dpt.asnumpy(a))

gives a SyclProgramCompilationError, which is the error we're used to seeing for float64 kernels on float32-only hardware.


The problem in the above example seems to be that the built-in add operator is upcasting to float64. A solution may be to split DpexKernelTargetContext into two separate contexts: DpexFP32KernelTargetContext and DpexFP64KernelTargetContext.

Each context can then have its own registry for math functions and built-in operators, registering only the architecture-aware flavours of each function. At compile time we can pick the right context based on the device being targeted.
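A rough sketch of the proposed split, in plain Python with no numba-dpex dependency. Everything here is hypothetical and for illustration only: `Registry`, `FakeDevice`, and `select_target_context` are made-up names, and the two context classes are empty stand-ins for the proposed DpexFP32KernelTargetContext and DpexFP64KernelTargetContext, not actual numba-dpex code. The stand-in device mirrors the `has_aspect_fp64` property that `dpctl.SyclDevice` exposes.

```python
from dataclasses import dataclass


@dataclass
class FakeDevice:
    """Hypothetical stand-in for dpctl.SyclDevice (only the fp64 aspect)."""
    name: str
    has_aspect_fp64: bool


class Registry:
    """Hypothetical registry mapping a function name to allowed signatures."""

    def __init__(self, name):
        self.name = name
        self.signatures = {}

    def register(self, func_name, *arg_types):
        self.signatures.setdefault(func_name, []).append(arg_types)

    def supports(self, func_name, *arg_types):
        return arg_types in self.signatures.get(func_name, [])


# One registry per floating-point capability level.
fp32_registry = Registry("fp32")
fp64_registry = Registry("fp64")

for fn in ("ceil", "floor"):
    # float32 flavours are valid on every device.
    fp32_registry.register(fn, "float32")
    fp64_registry.register(fn, "float32")
    # float64 flavours are registered only for fp64-capable devices.
    fp64_registry.register(fn, "float64")


class DpexFP32KernelTargetContext:
    """Stand-in for the proposed fp32-only context."""
    registry = fp32_registry


class DpexFP64KernelTargetContext:
    """Stand-in for the proposed fp64-capable context."""
    registry = fp64_registry


def select_target_context(device):
    """Pick the context class from the device's fp64 capability."""
    if device.has_aspect_fp64:
        return DpexFP64KernelTargetContext
    return DpexFP32KernelTargetContext


igpu = FakeDevice("integrated GPU", has_aspect_fp64=False)
dgpu = FakeDevice("discrete GPU", has_aspect_fp64=True)

igpu_ctx = select_target_context(igpu)
dgpu_ctx = select_target_context(dgpu)
```

With this layout, a float64 signature for math.ceil would simply be absent from the fp32 registry, so (in principle) a request for it could be rejected during typing rather than surfacing later as a SyclProgramCompilationError.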

chudur-budur, Mar 23 '23 01:03