Alexander Sinn

23 comments by Alexander Sinn

On CUDA GPUs, memory access is fastest when each thread reads a 4-, 8-, or 16-byte chunk and consecutive threads read consecutive chunks of memory. Everything else is slower. If you were...
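To make the chunk-size point concrete, here is a minimal host-side C++ sketch. It only computes byte offsets, it does not run on a GPU. The `Particle` struct and the helper names are hypothetical and do not come from the original comment. It compares a 12-byte array-of-structs layout, where neighbouring threads reading `x` are 12 bytes apart, with a plain `float` array, where neighbouring threads read adjacent 4-byte chunks.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record: three floats = 12 bytes, which is not 4, 8 or 16.
struct Particle { float x, y, z; };
static_assert(sizeof(Particle) == 12, "sketch assumes a packed 12-byte struct");

// Byte offset of the x component read by thread `tid` when the data is
// stored as an array of Particle structs.
constexpr std::size_t aos_offset_x(std::size_t tid) {
    return tid * sizeof(Particle);
}

// Byte offset of the x component read by thread `tid` when x lives in its
// own contiguous float array.
constexpr std::size_t soa_offset_x(std::size_t tid) {
    return tid * sizeof(float);
}

// True if the second access starts exactly where the first chunk ends,
// i.e. the next thread reads the next chunk of memory.
constexpr bool contiguous(std::size_t off0, std::size_t off1, std::size_t chunk) {
    return off1 == off0 + chunk;
}
```

With these helpers, `contiguous(soa_offset_x(0), soa_offset_x(1), sizeof(float))` is true, while the same check with `aos_offset_x` is false because the 4-byte reads are strided by 12 bytes.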

Yes. Basically, treat node-centered data the same as cell-centered data, just with one extra ghost cell in the hi direction and a 0.5 dx offset when converting to and from...
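The indexing convention described above can be sketched in one dimension as follows. This is only an illustration, and it assumes the truncated sentence refers to converting between indices and physical positions. The function names `cell_pos`, `node_pos`, and `num_nodes` are hypothetical and are not taken from any library.

```cpp
#include <cassert>

// Position of the center of cell i, for a domain starting at `lo`
// with cell size `dx`.
constexpr double cell_pos(double lo, double dx, int i) {
    return lo + (i + 0.5) * dx;
}

// Position of node i: the low edge of cell i, i.e. the cell-centered
// position shifted by 0.5 * dx.
constexpr double node_pos(double lo, double dx, int i) {
    return lo + i * dx;
}

// A direction with `ncells` cells has one more node than cells; the extra
// point sits at the hi end.
constexpr int num_nodes(int ncells) {
    return ncells + 1;
}
```

For example, with `lo = 0` and `dx = 1`, cell 2 is centered at 2.5 while node 2 sits at 2.0, and a 4-cell domain has 5 nodes.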

The alignment requirement is needed for GPUs to get coalesced memory accesses into an array of GPU complexes. If T is a SIMD type, these should not be stored in...
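As a rough illustration of such an alignment requirement, the sketch below declares a hypothetical `Complex<T>` type (not the actual library type) whose alignment equals its full size. Each element of an array then occupies exactly one aligned 8-byte chunk for `float` or one aligned 16-byte chunk for `double`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical complex type, aligned to the size of both components so
// that one element can be loaded as a single 8- or 16-byte chunk.
template <typename T>
struct alignas(2 * sizeof(T)) Complex {
    T re;
    T im;
};

static_assert(sizeof(Complex<float>) == 8 && alignof(Complex<float>) == 8,
              "float complex should fill one aligned 8-byte chunk");
static_assert(sizeof(Complex<double>) == 16 && alignof(Complex<double>) == 16,
              "double complex should fill one aligned 16-byte chunk");
```

Without the `alignas`, `Complex<double>` would typically only be 8-byte aligned, so an array of them would not be guaranteed to start on a 16-byte boundary.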