Pekka Jääskeläinen
...and the plan was to adapt the loop interchange pass to understand the parloop MD to perhaps get the outer loops switched to inner loops (selectively, when it's beneficial), as...
Should not be!
In fact, I was going to ask if you would like to write a new updated "blog post" like that, but I was thinking that maybe a better time for...
Nice! Did you already check what slows down the pathological cases?
How does this look: could we get an interesting CUDA update blog post for the 1.6 release?
Is this one ready?
Please send a PR if you wish to add this.
Sure, we can keep a wish list which might be useful if someone looks for ways to contribute.
The original idea was indeed to be able to define/restrict the platform's devices explicitly with the POCL_DEVICES list. It has worked well for testing with the CPU devices (and ttasim),...