AssertionError when calling add_prefetch with SubArrayRefs
Attempting to add a prefetch to a kernel with a SubArrayRef leads to an assertion error. Some mapper is attempting to create a new SubArrayRef but passes None for both swept_inames and subscript.
Reproducer:
import loopy as lp

child_knl = lp.make_function(
    "[N] -> {[i]: 0<=i<N-1}",
    """
    g[i] = f[i] + f[i+1]
    """, [...], name="func")

knl = lp.make_kernel(
    "[N] -> {[i]: 0<=i<N-1}",
    """
    [i]: g[i] = func([i]: f[i])
    """,
    [
        lp.GlobalArg("f", shape=("N",)),
        lp.GlobalArg("g", shape=("N",)),
        ...
    ],
    options=lp.Options(write_cl=True),
)

knl = lp.merge([knl, child_knl])
knl = lp.split_iname(knl, "i", 32, outer_tag="g.0", inner_tag="l.0")
knl = lp.add_prefetch(knl, "f", "i_inner")
The same behavior occurs if I first merge knl with a kernel defining func.
Apologies if I'm misunderstanding/misusing things - flying a bit blind here. (If there happen to be examples for this type of usage anywhere, I'd be glad for them - couldn't find any in, e.g., test_callables.py.)
Hello @zachjweiner! There are a couple of issues here:
1. The issue you point out regarding prefetch over sub-array-refs is a bug.
2. There isn't good shape-inference support at a call site yet. So, solving (1) might not be fruitful yet, as I would expect that the value of "N" in the callee (func) would also have to be updated as we update the sub-array region passed in.
Thanks, @kaushikcfd! I'm still wrapping my head around the callables "model" at the moment.
In practice, all I'm actually looking to do (for now) is prefetch over inames that the callee isn't aware of - something like
child_knl = lp.make_function(
    "{:}",
    """
    g = f[0] + f[1]
    """, name="func")

knl = lp.make_kernel(
    "{[i] : 0 <= i < 32}",
    """
    g[i] = func(f[:, i])
    """,
    [
        lp.GlobalArg("f", shape=(2, 32)),
        lp.GlobalArg("g", shape=(32,))
    ],
    options=lp.Options(write_cl=True),
)

knl = lp.split_iname(knl, "i", 4, outer_tag="g.0", inner_tag="l.0")
knl = lp.add_prefetch(knl, "f", "i_inner")
knl = lp.merge([knl, child_knl])
If bug (1) were fixed, would this use case be able to skirt the shape-inference issues? I imagine it would if one inlined all called kernels (or redirected inputs via temporaries).
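For concreteness, the inlining route I have in mind would look roughly like the following. This is an untested sketch: I'm assuming lp.inline_callable_kernel is the right transform to use here and that it belongs before the split/prefetch.

import loopy as lp

child_knl = lp.make_function(
    "{:}",
    """
    g = f[0] + f[1]
    """, name="func")

knl = lp.make_kernel(
    "{[i] : 0 <= i < 32}",
    """
    g[i] = func(f[:, i])
    """,
    [
        lp.GlobalArg("f", shape=(2, 32)),
        lp.GlobalArg("g", shape=(32,))
    ],
    options=lp.Options(write_cl=True),
)

# merge the callee in and inline it, so that the call (and its
# sub-array-ref argument) is gone before any prefetching happens
knl = lp.merge([knl, child_knl])
knl = lp.inline_callable_kernel(knl, "func")

# with "func" inlined, "f" is accessed through ordinary subscripts,
# so the usual prefetch machinery should (I assume) apply as-is
knl = lp.split_iname(knl, "i", 4, outer_tag="g.0", inner_tag="l.0")
knl = lp.add_prefetch(knl, "f", "i_inner")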