Don't override argument names with @outer and @inner iterator variables

Open dmed256 opened this issue 5 years ago • 2 comments

@kernel void kernel(float *x) {
  for (int x = 0; x < 10; ++x; @outer) {...}
}

ends up passing the iterator

int x

to the launched kernel instead of the kernel argument

float *x

dmed256, Apr 03 '19 18:04

In parsers I write (e.g., using cpp-peglib), I simply internally rename all variables using a suffix based on line number and character position on the line at the point of definition.
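
A minimal sketch of that renaming idea, not taken from cpp-peglib or OCCA: the helper name `uniqueName` and the `_L<line>_C<column>` suffix format are assumptions chosen for illustration. It shows how two definitions that share a source name end up with distinct internal names once the definition position is appended.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: derive an internal name from the source name plus
// the line and column where the variable is defined. Two definitions can
// never share a position, so the resulting names cannot collide.
std::string uniqueName(const std::string &name, int line, int column) {
  return name + "_L" + std::to_string(line) + "_C" + std::to_string(column);
}
```

With this scheme, a kernel argument `x` defined on one line and a loop iterator `x` defined on the next would be renamed to different identifiers (the positions passed in are whatever the parser reports), so extracting the iterator into another scope cannot hide the argument.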

pdhahn, Jul 05 '19 20:07

Normally this wouldn't be an issue, because the for loop and the kernel arguments live in different scopes. That doesn't hold here: the @outer loop variables are extracted out of the kernel and into the launcher, where they hide the kernel arguments of the same name.

@kernel void kernel(float *x) {
  for (int x = 0; x < 10; ++x; @outer) {
    for (int i = 0; i < 1; ++i; @inner) {}
  }
}

---[ Launcher ]-------------------------

#include <occa/modes/serial/kernel.hpp>
#include <occa/core/base.hpp>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <stdint.h>
#include <occa.hpp>

using namespace std;
using namespace occa;

extern "C" void kernel(occa::modeKernel_t **deviceKernel,
                       float *x) {
  {
    occa::dim outer, inner;
    outer.dims = 1;
    inner.dims = 1;
    int x = 0;
    outer[0] = 10 - 0;
    int i = 0;
    inner[0] = 1 - 0;
    occa::kernel kernel(deviceKernel[0]);
    kernel.setRunDims(outer, inner);
    kernel(x);
  }
}
========================================

---[ Kernel ]---------------------------


extern "C" __global__ void _occa_kernel_0(float *x) {
  {
    int x = 0 + blockIdx.x;
    {
      int i = 0 + threadIdx.x;
    }
  }
}
========================================

For newly created variables, there is usually a _occa_ prefix attached to them, such as _occa_tiled_x in this example:

@kernel void kernel(float *x) {
  for (int x = 0; x < 10; ++x; @tile(16, @outer, @inner)) {
  }
}

extern "C" __global__ void _occa_kernel_0(float *x) {
  {
    int _occa_tiled_x = 0 + (16 * blockIdx.x);
    {
      int x = _occa_tiled_x + threadIdx.x;
      if (x < 10) {}
    }
  }
}
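
To illustrate how the same prefixing could avoid the launcher problem, here is a standalone C++ sketch. It does not use the OCCA API: the overloaded `launched` functions stand in for the `kernel(x)` call, and the name `_occa_outer_x` is a hypothetical choice, not something OCCA is known to generate. The first function mirrors the generated launcher, where the extracted iterator hides the argument; the second gives the iterator a prefixed name so the argument is passed through.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the launched kernel: report which type was passed.
std::string launched(float *) { return "float*"; }
std::string launched(int) { return "int"; }

// Mirrors the generated launcher: the iterator extracted from the @outer
// loop is declared in a nested block and shadows the argument `x`.
std::string launcherShadowed(float *x) {
  (void)x;  // the argument is never reached below
  {
    int x = 0;           // extracted iterator, same name as the argument
    return launched(x);  // passes the int, not the float*
  }
}

// Assumed fix: prefix the extracted iterator (hypothetical name) so that
// `x` still refers to the kernel argument.
std::string launcherPrefixed(float *x) {
  {
    int _occa_outer_x = 0;
    (void)_occa_outer_x;  // only needed to compute the launch dims
    return launched(x);   // passes the float* argument
  }
}
```

Calling `launcherShadowed(nullptr)` selects the `int` overload, while `launcherPrefixed(nullptr)` selects the `float *` one, which is the behavior the issue asks for.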

dmed256, Jul 06 '19 00:07