Compiler threw exception: Normalizer: Invalid local type for memory area

Open shazz opened this issue 4 years ago • 6 comments

Hi,

I'm trying to compile some existing OpenCL kernels (tested on NVIDIA GPUs), and during compilation I get this error:

[E] Thu Jul 30 11:45:43 2020: Compiler threw exception: Normalizer: Invalid local type for memory area: (g) f32* %flag.0

What does it mean?

Thanks!

shazz avatar Jul 30 '20 18:07 shazz

This means that there is some OpenCL C code that VC4C does not handle correctly:)

Do you have access to the source code of the OpenCL kernel? If not, you can set the environment variable VC4CL_DEBUG=code, which will dump the source code to a temporary file.

doe300 avatar Jul 30 '20 18:07 doe300

Ha ha :D Ah, yes, and it has to be an env var for root :) I forgot. OK, it dumped all the OpenCL code, but I already had the sources :)

Is there a way to find out on which line the error occurred? It's a pretty large collection of kernels (8000 lines).

shazz avatar Jul 30 '20 20:07 shazz

No, not really. At this point there is no longer any association between the intermediate code and the original source code positions.

Judging by the error message, the variable accessed is a __global (or global) float pointer whose name contains flag (the other parts of the name are added by the VC4C compiler). So you could check whether this narrows down the candidates. But it is also possible that this variable was introduced by Clang, in which case that does not help...

doe300 avatar Jul 31 '20 05:07 doe300

OK, I'll split the source by function, then build each one individually and check the result.
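One way to start such a split is to list the lines on which each kernel begins. The following is only a sketch in plain C; it is not part of VC4C or VC4CL, and the function name find_lines_containing is made up for illustration. It does a plain substring match, so it will also hit __kernel inside comments and will miss kernels declared with the kernel spelling.

```c
#include <stddef.h>
#include <string.h>

/* Records the 1-based number of every line of src that contains needle.
   Stores at most max_out line numbers in out and returns the total
   number of matching lines (which may exceed max_out). */
size_t find_lines_containing(const char *src, const char *needle,
                             size_t *out, size_t max_out)
{
    size_t found = 0;
    size_t line = 1;
    size_t nlen = strlen(needle);
    const char *line_start = src;

    while (*line_start != '\0') {
        const char *line_end = strchr(line_start, '\n');
        size_t len = line_end ? (size_t)(line_end - line_start)
                              : strlen(line_start);

        /* Search only within the current line. */
        for (size_t i = 0; i + nlen <= len; ++i) {
            if (memcmp(line_start + i, needle, nlen) == 0) {
                if (found < max_out)
                    out[found] = line;
                ++found;
                break; /* count each line once */
            }
        }

        if (line_end == NULL)
            break; /* last line had no trailing newline */
        line_start = line_end + 1;
        ++line;
    }
    return found;
}
```

Called with "__kernel" as the needle, this gives the cut points for per-kernel files; called with "flag", it lists candidate lines for the variable named in the error message.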

shazz avatar Jul 31 '20 10:07 shazz

So, by splitting the source by kernel function, I was able to find the various problems (and some solutions). I had the following compilation errors (examples):

  • General: Can't emplace at the start of a basic block: nop (register) (delay)
  • General: No CFG edge found for branch to target: label %if.then70
  • Normalizer: Invalid local type for memory area: (g) f32* %psi.0
  • Normalizer: Not normalized instruction found: i32 %call193 = i32 printf(i32 %global_data_address) (unsigned group_uniform splat)

Then, what I was able to fix:

  • I fixed the Not normalized instruction error by just commenting out the printf
  • The Normalizer: Invalid local type for memory area errors were due to some pointers being declared like this:
__kernel void some_function(__global REAL *var) {

  __global REAL *flag;
  if (...) {
    flag = &var[FLAGU * size];

So I fixed it by initializing the pointer at its declaration with a valid address (NULL or 0 doesn't work), like __global REAL *flag = &var[FLAGU * size];
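The two shapes can be put side by side as a self-contained sketch. Everything beyond the fragment above is an assumption: REAL is taken to be float (the error message reports f32), the value of FLAGU, the size and use_flag parameters, and both function names are hypothetical, and whether this reduced form still triggers the error under VC4C has not been verified. The shim at the top only lets the same text build as plain C on a host; an OpenCL C compiler defines __OPENCL_VERSION__ and skips it.

```c
#ifndef __OPENCL_VERSION__
/* Host-side shims so the sketch also builds as plain C. */
#define __kernel
#define __global
#endif

typedef float REAL; /* assumption: the error message reports an f32 pointer */
#define FLAGU 2     /* hypothetical field index */

/* Shape that was reported to fail: the pointer is declared without a
   value and only assigned inside a branch. */
__kernel void declared_then_assigned(__global REAL *var, int size, int use_flag)
{
    __global REAL *flag;
    if (use_flag) {
        flag = &var[FLAGU * size];
        flag[0] = 1.0f;
    }
}

/* Workaround from the post: give the pointer a valid address at its
   declaration, then use it inside the branch. */
__kernel void initialized_at_declaration(__global REAL *var, int size, int use_flag)
{
    __global REAL *flag = &var[FLAGU * size];
    if (use_flag) {
        flag[0] = 1.0f;
    }
}
```

Both functions write the same element when the branch is taken, so the rewrite should not change results as long as the initial address is valid for every work-item.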

  • The Can't emplace at the start of a basic block: nop (register) (delay) error is a tricky one: for a given file it doesn't always happen; sometimes I just compile again and it works. For some others, never.

  • The No CFG edge found for branch to target error happens for some specific if statements, and it looks to be related to some memory issue (or at least to branch distance):

This one fails:

__kernel void some_fn(__global int *BINDEX) {
  int it = get_global_id(0);
  ...
  if (BINDEX[8 * size + it] == NORMAL_X) {

That is, the index 8 * size + it doesn't work, but 5 * size + it works.

Any... comments?

shazz avatar Aug 03 '20 14:08 shazz

First of all, thanks for the investigation:)

> I fixed the Not normalized instruction by just commenting the printf

Yes, printf is not supported yet (and probably won't be for a long time).

> The Normalizer: Invalid local type for memory area were due to the fact some pointers were defined like that:

This one, I fear, will be hard to fix, but I guess I should have a look at it. Can you post a full (minimal) kernel that contains this kind of code?

> the Can't emplace at the start of a basic block: nop (register) (delay) is a tricky one, for a given file it doesn't always happen , sometimes I just try to compile again and it works. For some others, never.

and

> the No CFG edge found for branch to target happens for some specific if statements and it looks to be related to some memory issues (or branch distance at least):

may be fixed by the latest commits on the devel branch.

For the last one, what do you mean by "this one fails"? Does it not compile or does the execution fail? Here too, a full (minimal) kernel code example would be helpful.

doe300 avatar Aug 03 '20 16:08 doe300