Use of undeclared identifier when summing over two axes
The code below crashes Bohrium. Both sums seem to be required for the crash to occur.
import bohrium as np
dim = 16
c = np.arange(dim**3, dtype=np.float32).reshape((dim,dim,dim))
k3 = np.sum(c, axis=2)
k4 = np.sum(c, axis=1)
The generated kernel is simply missing the global-memory array that it tries to write to: a0 is assigned in the first loop but never appears in the kernel's parameter list.
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#include <kernel_dependencies/complex_opencl.h>
#include <kernel_dependencies/integer_operations.h>
__kernel void execute_9775374476049381469(ulong vo0, ulong vs0_0, ulong vs0_1, ulong vo1, ulong vs1_0, ulong vs1_1, ulong vs1_2, const long c1, const long c2, const float c3, const float c4) {
// The IDs of the threaded blocks:
const uint g0 = get_global_id(0); if (g0 >= 16) { return; } // Prevent overflow
{const ulong i0 = g0;
for (ulong i1 = 0; i1 < 16; ++i1) {
const ulong idx0 = (vo0 +i0*vs0_0 +i1*vs0_1);
a0[idx0] = c4;
}
for (ulong i1 = 0; i1 < 16; ++i1) {
float t1;
t1 = c3;
for (ulong i2 = 0; i2 < 16; ++i2) {
uint t2;
float t3;
float t0;
t2 = (vo1 +i0*vs1_0 +i1*vs1_1 +i2*vs1_2);
t3 = t2;
t1 += t3;
t0 += t3;
}
}
}
}
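The missing declaration can also be spotted mechanically. Below is a minimal sketch, not part of Bohrium: `undeclared_arrays` is a hypothetical helper that uses regular expressions (not a real OpenCL parser) to list names that are indexed like arrays but are neither kernel parameters nor local variables. The `kernel` string is a shortened copy of the first loop from the dump above.

```python
import re

def undeclared_arrays(kernel_src):
    """Hypothetical helper: names indexed with [...] that are never declared."""
    # Collect parameter names from the __kernel signature (last word of each parameter).
    sig = re.search(r"__kernel\s+void\s+\w+\s*\((.*?)\)", kernel_src, re.S)
    params = set()
    if sig:
        for param in sig.group(1).split(","):
            words = re.findall(r"\w+", param)
            if words:
                params.add(words[-1])
    # Collect simple local declarations such as "float t1;" or "const ulong idx0 = ...".
    scalar_types = r"(?:u?long|u?int|float|double)"
    local_names = set(re.findall(r"\b" + scalar_types + r"\s+(\w+)\s*[;=]", kernel_src))
    # Every identifier that is immediately followed by "[" is treated as an array use.
    used = set(re.findall(r"\b([A-Za-z_]\w*)\s*\[", kernel_src))
    return sorted(used - params - local_names)

# Shortened copy of the failing kernel: a0 is written but never declared.
kernel = """
__kernel void execute(ulong vo0, ulong vs0_0, ulong vs0_1, const float c4) {
    const uint g0 = get_global_id(0); if (g0 >= 16) { return; }
    {const ulong i0 = g0;
    for (ulong i1 = 0; i1 < 16; ++i1) {
        const ulong idx0 = (vo0 +i0*vs0_0 +i1*vs0_1);
        a0[idx0] = c4;
    }}
}
"""

print(undeclared_arrays(kernel))  # ['a0']
```

With a `__global float *a0` parameter added to the signature, the same check returns an empty list, which is presumably what a correct kernel would look like.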
Here is a simpler example. The dimensions and the axis seem to play a role.
dim = 50
a = np.arange(dim**4, dtype=np.float32).reshape((dim,dim,dim,dim))
k = np.sum(a, axis=1)
These work fine on my MacBook with Bohrium compiled from master. As a side effect, I print k or k3 and k4. Is this still an issue?
I won't have access to my test machine again before Monday. But if I remember correctly, the problem disappears when you print the variables, as this will allocate the arrays in global memory.
Then indeed mine also crashes when executing:
import bohrium as np
dim = 50
a = np.arange(dim**4, dtype=np.float32).reshape((dim,dim,dim,dim))
k = np.sum(a, axis=1)
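For anyone checking the printing workaround mentioned above against known-good values, here is a minimal pure-Python reference for the first (dim = 16) example. It deliberately uses neither NumPy nor Bohrium; the names `k3_ref` and `k4_ref` are made up for this sketch.

```python
dim = 16

def c(i, j, k):
    # Value of np.arange(dim**3).reshape((dim, dim, dim))[i, j, k]
    return i * dim * dim + j * dim + k

# Reference for k3 = np.sum(c, axis=2): sum over the last index, result indexed [i][j].
k3_ref = [[sum(c(i, j, k) for k in range(dim)) for j in range(dim)] for i in range(dim)]
# Reference for k4 = np.sum(c, axis=1): sum over the middle index, result indexed [i][k].
k4_ref = [[sum(c(i, j, k) for j in range(dim)) for k in range(dim)] for i in range(dim)]

# All values are integers far below 2**24, so float32 represents them exactly
# and the order in which a backend adds them does not change the result.
assert max(max(row) for row in k3_ref + k4_ref) < 2**24

print(k3_ref[0][0], k4_ref[0][0])  # 120 1920
```

If the printed k3 and k4 match these values, the workaround gives correct results and the problem is limited to the case where the output arrays are never materialized.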