
Parallel execution within SDFG connected component

Open TizianoDeMatteis opened this issue 3 years ago • 0 comments

Describe the bug

When executing a connected component of an SDFG state, independent "subcomponents" are not executed in parallel.

To Reproduce

Consider the following DaCe program:

import dace
import numpy as np

N = dace.symbol('N', dace.int32)


@dace.program
def prog(x: dace.float32[N], y: dace.float32[N], v: dace.float32[N], w: dace.float32[N]):
    return np.dot(x,y) + np.dot(v,w)



size = 16
x = np.random.rand(size).astype(np.float32)
y = np.random.rand(size).astype(np.float32)
v = np.random.rand(size).astype(np.float32)
w = np.random.rand(size).astype(np.float32)

sdfg = prog.to_sdfg()
res = sdfg(x=x, y=y, v=v, w=w, N=size)
assert np.allclose(res, np.dot(x,y) + np.dot(v,w))

It computes res = np.dot(x,y) + np.dot(v,w). The two dot products are independent, but the generated code executes them sequentially, one after the other:

void __program_prog_internal(/* ... */){
    // ..
    _Dot__sdfg_1_0_0_10(__state, &x[0], &y[0], __tmp0, N);
    _Dot__sdfg_1_0_0_10(__state, &v[0], &w[0], __tmp1, N);
}

Expected behavior

The two dot products should be executed in parallel, via OpenMP sections.
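
As a rough illustration of the intended schedule, here is a sketch in plain NumPy with a thread pool. It is not DaCe-generated code and does not use any DaCe API; the helper name parallel_dots is hypothetical. The two dot products are submitted as independent tasks and joined before the addition, which is roughly the role OpenMP sections would play in the generated C++.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def parallel_dots(x, y, v, w):
    """Compute np.dot(x, y) + np.dot(v, w) with the two dots as separate tasks."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The two tasks share no data, so their execution may overlap.
        first = pool.submit(np.dot, x, y)
        second = pool.submit(np.dot, v, w)
        # result() blocks until each task has finished: this is the join point.
        return first.result() + second.result()


rng = np.random.default_rng(0)
x, y, v, w = (rng.random(16).astype(np.float32) for _ in range(4))
res = parallel_dots(x, y, v, w)
```

The result should match the sequential expression from the reproducer; only the order in which the two dot products may run is different.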

Note: the openmp_sections config flag is already set to true but, judging from its description, it seems to apply to parallel execution between connected components, not within a connected component.

Alternatively, this could be implemented as an ad-hoc transformation that, in the simplest case, uses state fission to create states with independent components.
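
One way such a transformation might locate the independent parts is sketched below on a toy dataflow graph, not on a real SDFG: drop the node where the branches join and group what remains into weakly connected components. The edge-list encoding, the node names, and independent_subcomponents are all hypothetical and do not use the DaCe API.

```python
from collections import defaultdict


def independent_subcomponents(edges, join_node):
    """Group nodes into weakly connected components after removing join_node."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        if join_node in (src, dst):
            continue  # cutting the join node separates the branches
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk over undirected neighbours collects one group.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Toy model of the state in this issue: two dot products feeding one addition.
edges = [
    ("x", "dot0"), ("y", "dot0"), ("dot0", "tmp0"), ("tmp0", "add"),
    ("v", "dot1"), ("w", "dot1"), ("dot1", "tmp1"), ("tmp1", "add"),
    ("add", "ret"),
]
groups = independent_subcomponents(edges, join_node="add")
```

Here groups contains one entry per dot product. In a real implementation each group would presumably end up in its own state or its own OpenMP section, with the addition scheduled after both.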

TizianoDeMatteis, May 05 '21 07:05