[loopy.match] Targeting reduction expression in an instruction
In the following kernel:
import loopy as lp

knl = lp.make_kernel(
    ["{[i]: 0<=i<8}",
     "{[j]: 8<=j<10}"],
    """
    out = sum(i, x[i]) + sum(j, x[j])
    """)
It seems that there's no way to prefetch x in the sum-reduce over i. The root of the problem seems to be that the IR does not allow tagging reduction expressions.
From my perspective, the idea is that you turn any subexpression you would like to tag into a substitution rule (using, e.g., extract_subst).
To be fair, no other type of subexpression can be individually tagged either.
There's no way to target the individual reduction nodes with extract_subst either. Applying lp.extract_subst(knl, "subst", "x[arg0]", parameters=("arg0",)) converts the x accesses in both reductions to the substitution, not just the one in the reduction over i.
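To illustrate why a template like x[arg0] cannot single out one reduction, here is a toy model of template matching over an expression tree. It is only a sketch and not loopy's implementation: the nested-tuple encoding and the helper names match and extract_all are made up for this example. The point is that the template carries no information about the enclosing reduction, so every x[...] access matches.

```python
# Toy model only: expressions are nested tuples, e.g. ("sub", "x", "i") for x[i].
# Neither the encoding nor these helpers come from loopy.

def match(template, expr, params, bindings):
    """Return True if expr matches template; names in params act as wildcards."""
    if isinstance(template, str) and template in params:
        if template in bindings:
            return bindings[template] == expr
        bindings[template] = expr
        return True
    if isinstance(template, tuple) and isinstance(expr, tuple):
        return (len(template) == len(expr)
                and all(match(t, e, params, bindings)
                        for t, e in zip(template, expr)))
    return template == expr


def extract_all(expr, name, template, params):
    """Replace every subexpression matching template with a call to name."""
    bindings = {}
    if match(template, expr, params, bindings):
        return ("call", name) + tuple(bindings[p] for p in params)
    if isinstance(expr, tuple):
        return tuple(extract_all(e, name, template, params) for e in expr)
    return expr


# out = sum(i, x[i]) + sum(j, x[j])
expr = ("+",
        ("sum", "i", ("sub", "x", "i")),
        ("sum", "j", ("sub", "x", "j")))

result = extract_all(expr, "subst", ("sub", "x", "arg0"), ("arg0",))
print(result)
```

Both x[i] and x[j] end up rewritten to subst(...), which mirrors the behavior described above.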
I think it could be taught to do that, e.g.:
extract_subst(knl, "sum(i, *)")
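As a rough sketch of what a reduction-aware template such as "sum(i, *)" might mean, the toy below pulls the whole reduction over a given iname out into a named rule and leaves other reductions alone. This is purely hypothetical: the tuple encoding and the helper extract_reduction are invented for illustration, and nothing here reflects how loopy does or would implement it.

```python
# Toy model only: expressions are nested tuples, ("sum", iname, body) is a
# reduction and ("sub", "x", "i") stands for x[i]. Not loopy code.

def extract_reduction(expr, iname, rule_name):
    """Replace the ("sum", iname, <anything>) node with a call to rule_name.

    Returns (new_expr, rules), where rules maps rule_name to the extracted
    reduction. The sketch assumes at most one reduction over iname; a later
    match would overwrite an earlier one.
    """
    rules = {}

    def rec(e):
        if isinstance(e, tuple):
            if len(e) == 3 and e[0] == "sum" and e[1] == iname:
                rules[rule_name] = e      # body acts as the "*" wildcard
                return ("call", rule_name)
            return tuple(rec(child) for child in e)
        return e

    return rec(expr), rules


# out = sum(i, x[i]) + sum(j, x[j])
expr = ("+",
        ("sum", "i", ("sub", "x", "i")),
        ("sum", "j", ("sub", "x", "j")))

new_expr, rules = extract_reduction(expr, "i", "subst")
print(new_expr)
print(rules)
```

Here only the reduction over i becomes subst(), while sum(j, x[j]) is untouched, so a later transformation could target the rule by name.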