Phil Miller - NOAA

319 comments by Phil Miller - NOAA

Actually, my question about `cudaEventRecord` and `cudaEventQuery` being thread safe applies to those other CUDA API functions as well - if they're not, then we need the lock to encompass...

Now for the other side of the coin: are we worried that the lock on every kernel launch is going to present unacceptable added overhead?

Does SYCL need locks added as well?

Crud, there was another comment I thought I'd posted earlier, but GitHub seems to have eaten it, referencing TOCTTOU issues with multi-threaded callers using this functionality. Any multi-threaded callers that...

If we want to meaningfully support multi-threaded callers without external synchronization between them, we'd need an API more analogous to creating and recording explicit events on the stream, with each...
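To make the event idea concrete, here is a minimal host-only sketch. It is a toy model, not the API under discussion and not CUDA itself: `ToyStream`, `Event`, `record`, `has_completed`, `complete_one`, and `is_idle` are all hypothetical names, and the "device" is just a pair of counters. The point it tries to show is that a per-caller event answers "has the work submitted before my marker finished?", which stays true once it becomes true, whereas a stream-wide idle check can be invalidated by another thread's submission between the check and its use.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical host-only model of an in-order stream (assumption: work
// completes in submission order, as on a single CUDA stream).
class ToyStream {
public:
  // Marker meaning "everything submitted before this point".
  struct Event { std::uint64_t ticket; };

  // Enqueue one unit of work.
  void submit() { submitted_.fetch_add(1, std::memory_order_release); }

  // Similar in spirit to recording an event on the stream: capture how
  // much work has been submitted so far.
  Event record() const {
    return Event{submitted_.load(std::memory_order_acquire)};
  }

  // Stand-in for the device finishing the oldest pending unit of work.
  void complete_one() {
    assert(completed_.load() < submitted_.load());
    completed_.fetch_add(1, std::memory_order_release);
  }

  // True once all work submitted before `e` was recorded is done.
  // Later submissions by other threads cannot flip this back to false.
  bool has_completed(Event e) const {
    return completed_.load(std::memory_order_acquire) >= e.ticket;
  }

  // Stream-wide check: subject to TOCTTOU, since another thread may
  // submit more work right after this returns true.
  bool is_idle() const {
    return completed_.load(std::memory_order_acquire) ==
           submitted_.load(std::memory_order_acquire);
  }

private:
  std::atomic<std::uint64_t> submitted_{0};
  std::atomic<std::uint64_t> completed_{0};
};
```

In this model each caller would hold its own `Event` and poll `has_completed` on it, so one thread's answer does not depend on what other threads submit afterwards.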

One partial way to address the multi-threading concerns would be to rename to something like `submitted_work_has_completed` and reverse the sense accordingly. If it returns `true`, then the caller can be...

This is a **very** rough draft.
* It makes no effort to be entirely clean with respect to memory spaces
* Its type genericity has room for improvement
* I've...

The algorithm implemented here is entirely naive, iterating a single bit at a time. I haven't done any performance testing on it yet. The GPU Gems section that Nic derived...

Ugh, I missed that I wasn't exercising the trickiest version of the templates in the test as pushed

The handling of execution and memory spaces is somewhat sloppy right now, in that the working memory gets allocated in the default (device) memory space, and the code just assume...