BabelStream
Add Python implementation
Numba seems to be the NVIDIA-recognised way of doing CUDA programming in Python. It supports direct kernel programming, similar to how it's done in Julia: the annotated function is intercepted and then compiled to PTX, HSA (for AMD ROCm), or vectorised CPU code. It's not clear whether Intel GPUs are supported, so the PR should explore this as well.
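As a rough illustration of the kernel style this would involve, here is a minimal sketch of a triad kernel (`a = b + scalar * c`) written with `numba.cuda`. The names `triad_kernel` and `triad`, the block size of 256, and the scalar value are placeholders chosen for this example, not anything taken from the existing BabelStream sources. The NumPy fallback is only there so the snippet still runs on a machine without Numba or a CUDA device.

```python
import numpy as np

try:
    from numba import cuda
    HAVE_CUDA = cuda.is_available()
except ImportError:
    # Numba is not installed; use the NumPy fallback below.
    cuda = None
    HAVE_CUDA = False

SCALAR = 0.4  # illustrative value, an assumption for this sketch

if HAVE_CUDA:
    @cuda.jit
    def triad_kernel(a, b, c, scalar):
        # One thread per element; cuda.grid(1) gives the global 1D thread index.
        i = cuda.grid(1)
        if i < a.size:
            a[i] = b[i] + scalar * c[i]


def triad(b, c, scalar=SCALAR, threads_per_block=256):
    """Return b + scalar * c, computed on the GPU when one is available."""
    if not HAVE_CUDA:
        # CPU fallback so the sketch runs anywhere.
        return b + scalar * c

    # Copy inputs to the device and allocate the output there.
    d_b = cuda.to_device(b)
    d_c = cuda.to_device(c)
    d_a = cuda.device_array_like(b)

    # Round up so every element is covered by a thread.
    blocks = (b.size + threads_per_block - 1) // threads_per_block
    triad_kernel[blocks, threads_per_block](d_a, d_b, d_c, scalar)
    return d_a.copy_to_host()
```

A real implementation would keep the arrays on the device across iterations and time only the kernel launches. The host/device copies are done inside `triad` here purely to keep the example short.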