burn
Autotune WGPU: binary_elemwise kernel
Use the autotune mechanism in the WGPU backend to find the fastest kernel version of binary_elemwise and its inplace counterpart by varying the WORKGROUP argument.
Take inspiration from the burn-wgpu/src/kernel/matmul/tune folder.
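
To make the request concrete, here is a minimal, CPU-only sketch of the idea: benchmark one candidate per workgroup size, keep the fastest, and cache the choice per input-size bucket. It does not use the burn-wgpu autotune API. `Candidate`, `Tuner`, `pick_fastest`, `time_candidate` and `run_binary_elemwise` are hypothetical names made up for illustration, and the "kernel" is a chunked element-wise add standing in for a real WGPU dispatch.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical candidate: one kernel variant per workgroup size.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Candidate {
    workgroup: usize,
}

/// CPU stand-in for the kernel launch (assumption: a real version would
/// dispatch a shader compiled with this workgroup size).
fn run_binary_elemwise(c: Candidate, lhs: &[f32], rhs: &[f32], out: &mut [f32]) {
    assert!(c.workgroup > 0, "workgroup size must be non-zero");
    for ((o, l), r) in out
        .chunks_mut(c.workgroup)
        .zip(lhs.chunks(c.workgroup))
        .zip(rhs.chunks(c.workgroup))
    {
        for ((o, l), r) in o.iter_mut().zip(l).zip(r) {
            *o = l + r;
        }
    }
}

/// Wall-clock cost of a few repetitions of one candidate.
fn time_candidate(c: Candidate, lhs: &[f32], rhs: &[f32]) -> Duration {
    let mut out = vec![0.0; lhs.len()];
    let start = Instant::now();
    for _ in 0..5 {
        run_binary_elemwise(c, lhs, rhs, &mut out);
    }
    start.elapsed()
}

/// Returns the candidate with the lowest cost according to `bench`.
fn pick_fastest<F>(candidates: &[Candidate], mut bench: F) -> Option<Candidate>
where
    F: FnMut(Candidate) -> Duration,
{
    candidates.iter().copied().min_by_key(|&c| bench(c))
}

/// Remembers the winner per input-size bucket so tuning runs once per bucket.
#[derive(Default)]
struct Tuner {
    cache: HashMap<usize, Candidate>,
}

impl Tuner {
    fn select<F>(&mut self, len: usize, candidates: &[Candidate], bench: F) -> Option<Candidate>
    where
        F: FnMut(Candidate) -> Duration,
    {
        // Bucket by the next power of two so nearby sizes share a result.
        let key = len.next_power_of_two();
        if let Some(&hit) = self.cache.get(&key) {
            return Some(hit);
        }
        let best = pick_fastest(candidates, bench)?;
        self.cache.insert(key, best);
        Some(best)
    }
}
```

A caller would pass something like `|c| time_candidate(c, &lhs, &rhs)` as `bench`. The inplace variant would presumably get its own candidate set and cache, since its fastest workgroup size need not match the out-of-place one.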