Tom Lin
Closing as ComputeCpp is EOL.
Addressed in v2 branch
Update: it's CUDA's wgsize (which propagates to threads per block) that's failing; PPWI is the one that's defined at compile time.
Thanks for reporting this, we're working on a new release which has a unified CMake build system. I'll leave this open until then.
The main branch now uses the unified build system and has been tested with the latest oneAPI release. Please reopen if the problem persists.
I suspect it's something to do with the VLA usage for `etot` and friends. This is also a problem for SYCL as we can't use VLAs, will probably have to...
Sorry, SYCL doesn't require compilation, so I immediately replaced the macro `NUM_TD_PER_THREAD` with the actual `wgSize` during the port, and then I confused myself into thinking the original codebase uses VLAs.
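For anyone following along, here is a minimal sketch of the distinction I mixed up. It is not taken from the codebase: the default value of `NUM_TD_PER_THREAD` and the helpers `sum_fixed` and `sum_runtime` are made up for illustration. An array sized by a macro has a compile-time length and is a plain fixed-size array; it only becomes a VLA (non-standard in C++) if the length is a runtime value such as `wgSize`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical default for illustration; a real build would set this itself.
#ifndef NUM_TD_PER_THREAD
#define NUM_TD_PER_THREAD 4
#endif

// Length comes from a macro, so it is a compile-time constant:
// this is an ordinary fixed-size array, not a VLA.
inline float sum_fixed(float value) {
  float etot[NUM_TD_PER_THREAD];
  for (int i = 0; i < NUM_TD_PER_THREAD; ++i) etot[i] = value;
  float total = 0.0f;
  for (int i = 0; i < NUM_TD_PER_THREAD; ++i) total += etot[i];
  return total;
}

// Length is only known at run time. Writing `float etot[n];` here would be
// a VLA, which standard C++ does not allow, so this host-side sketch uses
// std::vector instead.
inline float sum_runtime(std::size_t n, float value) {
  std::vector<float> etot(n, value);
  float total = 0.0f;
  for (float e : etot) total += e;
  return total;
}
```

The `std::vector` variant is only meant as a host-side stand-in; device code would need a different strategy for runtime-sized storage.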
LGTM
The CMake issue will be a separate PR