Pass variables outside @outer loops to launched kernels as additional arguments

Open camierjs opened this issue 6 years ago • 5 comments

kernel void k(void *qdata, const double *u){
   for (CeedInt i=0; i<Q; i++; outer) {
      const CeedScalar *J=u+Q*NC;
      CeedScalar *qd = (double*) qdata;
      qd[...] = J[...];
   }
}

This works, but if the declaration of J is moved before the for loop, the CUDA kernel does not get the right pointer J:

kernel void k(void *qdata, const double *u){
   const CeedScalar *J=u+Q*NC;
   CeedScalar *qd = (double*) qdata;
   for (CeedInt i=0; i<Q; i++; outer) {
      qd[...] = J[...];
   }
}

camierjs, Feb 27 '18 16:02

I think the pointer is being incremented on the host before being passed to the kernel, resulting in an invalid pointer (since the host-side pointer is actually a handle to the CUDA pointer).

dmed256, Feb 27 '18 23:02
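
To make the handle explanation above concrete, here is a minimal plain-C sketch, assuming a host-side handle that merely wraps the real data pointer. `mem_handle`, `offset_on_handle`, and `offset_on_data` are hypothetical names for illustration, not OCCA or CUDA APIs, and the sketch only compares addresses without dereferencing them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a backend memory handle: host code holds the
   handle, while the actual data lives behind device_ptr. */
typedef struct {
    double *device_ptr;
} mem_handle;

/* Offset applied to the handle's own address, as host code would do if it
   treated the handle as the data pointer. Integer arithmetic only; the
   result is never dereferenced. */
static uintptr_t offset_on_handle(const mem_handle *h, size_t n) {
    return (uintptr_t)h + n * sizeof(double);
}

/* Offset applied after the handle has been resolved to the data pointer,
   as code inside the launched kernel would do. */
static uintptr_t offset_on_data(const mem_handle *h, size_t n) {
    return (uintptr_t)(h->device_ptr + n);
}
```

The two results differ whenever the handle and the data are distinct objects, which would be consistent with `J` pointing somewhere unrelated when `u+Q*NC` is evaluated before the outer loop.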

Isn't this just an issue with address spaces?

pdhahn, Mar 02 '18 07:03

As for me, I learned to take care not to do what is shown in the example. :-) For pre-1.0 OCCA at least, the address space issue w.r.t. code located inside or outside the for-outer-inner block (actual nested kernel) was a learning curve for me personally.

pdhahn, Mar 02 '18 07:03

BTW, see also the somewhat related #82. It is at least similar in terms of the address space issues involved, I think.

pdhahn, Mar 02 '18 08:03

I understand the issue, and how the parser instantiates the nested kernels later in the code. What bothers me is that the whole body of the provided kernel should be in 'its space' right away, regardless of where the for-loops sit in that body. Once you are aware of this behavior, it's OK!

camierjs, Mar 02 '18 18:03
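
The split discussed in this thread, and the fix named in the issue title, can be sketched in plain C. This is only an illustration under stated assumptions, not OCCA's generated code: `k_body` and `k_launch` are hypothetical names, `Q` and `NC` are passed as explicit parameters, and the index `i` stands in for the indices elided in the original snippet.

```c
#include <stddef.h>

/* Hypothetical "launched kernel": the body of the outer loop. Every value
   it needs, including J, arrives as an explicit argument. */
static void k_body(size_t i, double *qd, const double *J) {
    qd[i] = J[i];  /* illustrative indexing; the original elides the indices */
}

/* Hypothetical host-side launcher: statements written before the outer
   loop run here, and their results are forwarded to the kernel body. */
static void k_launch(double *qdata, const double *u, size_t Q, size_t NC) {
    /* Valid in this sketch only because u is an ordinary host pointer. */
    const double *J = u + Q * NC;
    for (size_t i = 0; i < Q; ++i) {  /* stands in for the outer-loop launch */
        k_body(i, qdata, J);
    }
}
```

In this sketch the offset is computed on a real pointer, so forwarding `J` works. With a backend handle in place of `u`, the same offset would have to be applied after the handle is resolved, for example by computing `J` inside the outer loop as in the first snippet of the issue.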