OpenCL-CTS
Thread dimensions optimization
Implements the optimization described by #1397.
This doesn't affect most of our devices, which have typical maximum work-group sizes. For our FPGA emulation device, which reports a very large maximum work-group size (max work-item sizes of 67108864x67108864x67108864), it reduces the execution time of this test by more than 10 seconds.
This has two approvals. Can we merge it? I'd prefer not to merge my own PR. Thanks!
2024/07/02 teleconference: @bashbaug This needs rebasing.