devito
linalg.py failing with PGI openacc
Failing on the reduction clause. To reproduce with PGI + OpenACC:
export DEVITO_LANGUAGE=openacc
export DEVITO_PLATFORM=nvidiaX
export DEVITO_ARCH=pgcc
export DEVITO_LOGGING=DEBUG #optional
python3 misc/linalg.py mat-vec
Some discussion: https://devitocodes.slack.com/archives/CQ0AT90R0/p1607003860367600
@Leitevmd have you guys ever encountered this issue? This could potentially be relevant to you.
After merging https://github.com/devitocodes/devito/pull/2226 the generated code for the above MFE is:
    #pragma acc parallel loop present(A,b,x)
    for (int i = i_m; i <= i_M; i += 1)
    {
      for (int j = j_m; j <= j_M; j += 1)
      {
        b[i] += x[j]*A[i][j];
      }
    }
instead of
    START_TIMER(section0)
    #pragma acc parallel loop collapse(2) reduction(+:b[0:b_vec->size[0]]) present(A,b,x)
    for (int i = i_m; i <= i_M; i += 1)
    {
      for (int j = j_m; j <= j_M; j += 1)
      {
        b[i] += x[j]*A[i][j];
      }
    }
    STOP_TIMER(section0,timers)
The code now generated (the version without the reduction clause) works but is probably not optimal: only the outer `i` loop is explicitly parallelized. Should we close this issue, or rename it to track improving this with a reduction sub-pass?
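For discussion, here is a rough sketch of what a reduction sub-pass could aim for: accumulate each row into a scalar and put a scalar reduction on the inner loop, which avoids the array reduction that PGI rejected. This is hand-written, not Devito output. The function name `matvec_inner_reduction`, the `n`/`m` parameters, and the `copyin`/`copy` data clauses (used instead of `present` so the snippet stands alone) are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical hand-written kernel, NOT generated by Devito.
 * Computes b[i] += sum_j x[j]*A[i][j], as in the mat-vec example.
 * Without an OpenACC compiler the pragmas are ignored and the
 * loops simply run sequentially. */
void matvec_inner_reduction(int n, int m, float A[n][m], float x[m], float b[n])
{
  assert(n >= 0 && m > 0);

  /* Outer loop: one row per iteration, no write conflicts on b[i]. */
  #pragma acc parallel loop copyin(A[0:n][0:m], x[0:m]) copy(b[0:n])
  for (int i = 0; i < n; i += 1)
  {
    float sum = 0.0f;

    /* Inner loop: scalar reduction instead of an array reduction. */
    #pragma acc loop reduction(+:sum)
    for (int j = 0; j < m; j += 1)
    {
      sum += x[j]*A[i][j];
    }

    b[i] += sum;
  }
}
```

Whether this is actually faster than the current output would need measuring on the target GPU; the point is only that a scalar `reduction(+:sum)` is widely supported, while array-section reductions are where the original failure came from.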