celerity-runtime
RFC: Implement access::components higher-order range mapper
RFC, based on #182.
A common parallelization pattern, e.g. in dense matrix-vector products, is to map 1D thread-ids to the rows of a 2D matrix while iterating over all columns on each item. This currently requires a custom range mapper:
    const auto rows = [](const chunk<1> &chnk, const range<2> &buffer_size) {
        return subrange<2>{
            {chnk.offset[0], 0},
            {chnk.range[0], buffer_size[1]}
        };
    };
While we could just have `access::rows` and `access::columns` convenience range mappers, with all the confusion about row-major vs. column-major vs. fastest dimensions that would entail, I have come up with a generic solution for the entire class of these "component mappings".
Let me introduce the first higher-order range mapper, `access::components`. It constructs a `(chunk<KernelDims>, range<BufferDims>) -> subrange<BufferDims>` range mapper from `BufferDims` individual mappers of type `(chunk<KernelDims>, range<1>) -> subrange<1>`.
Together with the new, straightforward mapper `access::kernel_dim`, which creates a `subrange<1>` from a single kernel dimension, this allows us to rewrite the custom range mapper above like so:
    const auto rows = access::components(access::kernel_dim(0), access::all());
Note that any other range mapper that produces `subrange<1>` can be used for each component, such as `fixed` ~~or `neighborhood`~~.
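To make the composition concrete, here is a minimal sketch of how such a higher-order mapper could be assembled. It is not the implementation from this PR: the `sketch` namespace, the simplified `range`/`chunk`/`subrange` stand-ins, the aggregate-style `kernel_dim{0}` and `all{}`, and the helper `rows_by_hand` are all assumptions made for illustration. Each component mapper receives the full chunk plus the 1D extent of its own buffer dimension, and the resulting `subrange<1>` is copied into the matching dimension of the output.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

namespace sketch {

// Simplified stand-ins for the Celerity types, NOT the real definitions.
template <int Dims>
struct range {
    std::array<std::size_t, Dims> v{};
    std::size_t operator[](std::size_t d) const { return v[d]; }
};

template <int Dims>
struct chunk {
    std::array<std::size_t, Dims> offset{};
    std::array<std::size_t, Dims> range{};
};

template <int Dims>
struct subrange {
    std::array<std::size_t, Dims> offset{};
    std::array<std::size_t, Dims> range{};
};

// Component mapper: forwards one kernel dimension as a 1D subrange.
struct kernel_dim {
    std::size_t dim;
    template <int KernelDims>
    subrange<1> operator()(const chunk<KernelDims>& chnk, const range<1>&) const {
        return {{chnk.offset[dim]}, {chnk.range[dim]}};
    }
};

// Component mapper: selects the whole extent of its buffer dimension.
struct all {
    template <int KernelDims>
    subrange<1> operator()(const chunk<KernelDims>&, const range<1>& buffer_size) const {
        return {{0}, {buffer_size[0]}};
    }
};

// Higher-order mapper: one component mapper per buffer dimension.
template <typename... ComponentMappers>
class components {
  public:
    static constexpr int buffer_dims = static_cast<int>(sizeof...(ComponentMappers));

    explicit components(ComponentMappers... mappers) : m_mappers(mappers...) {}

    template <int KernelDims>
    subrange<buffer_dims> operator()(const chunk<KernelDims>& chnk, const range<buffer_dims>& buffer_size) const {
        return map(chnk, buffer_size, std::index_sequence_for<ComponentMappers...>{});
    }

  private:
    template <int KernelDims, std::size_t... Is>
    subrange<buffer_dims> map(const chunk<KernelDims>& chnk, const range<buffer_dims>& buffer_size,
                              std::index_sequence<Is...>) const {
        subrange<buffer_dims> result;
        // Evaluate the Is-th mapper on the Is-th buffer extent and store its 1D result.
        (set_component(result, Is, std::get<Is>(m_mappers)(chnk, range<1>{{buffer_size[Is]}})), ...);
        return result;
    }

    static void set_component(subrange<buffer_dims>& result, std::size_t d, const subrange<1>& sr) {
        result.offset[d] = sr.offset[0];
        result.range[d] = sr.range[0];
    }

    std::tuple<ComponentMappers...> m_mappers;
};

// The hand-written mapper from the top of this thread, for comparison.
inline subrange<2> rows_by_hand(const chunk<1>& chnk, const range<2>& buffer_size) {
    return {{chnk.offset[0], 0}, {chnk.range[0], buffer_size[1]}};
}

// Both formulations should agree on the same chunk and buffer size.
inline void check_rows_equivalence() {
    const chunk<1> chnk{{4}, {8}};
    const range<2> buffer_size{{32, 16}};
    const auto rows = components(kernel_dim{0}, all{});
    const subrange<2> a = rows(chnk, buffer_size);
    const subrange<2> b = rows_by_hand(chnk, buffer_size);
    assert(a.offset == b.offset && a.range == b.range);
}

} // namespace sketch
```

Note that this sketch has to read `offset[0]` and `range[0]` out of every component result, so it relies on each component returning exactly one contiguous `subrange<1>`.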
Check-perf-impact results: (9e900a306dc9f17e4a27439205a7680c)
:question: No new benchmark data submitted. :question:
Please re-run the microbenchmarks and include the results if your commit could potentially affect performance.
Live bikeshedding update about the naming of `access::kernel_dim`:
- All other built-in range mappers describe a pattern, while `kernel_dim` describes a function. That works within the `components` "DSL" but looks out of place as a stand-alone RM.
- @PeterTh suggested we could extend `one_to_one` to optionally take a `dim` parameter and then act as `chunk<Dims> -> subrange<1>`, because on a high level we think of the `rows` RM from the OP as acting like a `one_to_one` in the first dimension and like an `all` in the second one. However, this breaks again when regarding this specialization as a standalone RM, where a `chunk<2> -> subrange<1>` with that definition is an n-to-one, not a one-to-one.
- Naming proposals: @fknorr `access::linear`, `access::line` or `access::interval`.
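The n-to-one objection can be illustrated with a tiny sketch. The types `chunk2` and `interval` and the function `one_to_one_dim` are hypothetical stand-ins, not Celerity API: two 2D chunks that differ only in the dimension that is not selected end up mapped to the same 1D interval.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

namespace sketch_n_to_one {

// Hypothetical simplified 2D chunk and 1D interval, for illustration only.
struct chunk2 {
    std::array<std::size_t, 2> offset;
    std::array<std::size_t, 2> range;
};

struct interval {
    std::size_t offset;
    std::size_t range;
};

inline bool operator==(const interval& a, const interval& b) {
    return a.offset == b.offset && a.range == b.range;
}

// Hypothetical "one_to_one with a dim parameter": keeps only one kernel dimension.
inline interval one_to_one_dim(const chunk2& chnk, std::size_t dim) {
    return {chnk.offset[dim], chnk.range[dim]};
}

// Two distinct chunks (different position in dimension 1) collapse onto one interval.
inline void check_collapse() {
    const chunk2 a{{0, 0}, {4, 4}};
    const chunk2 b{{0, 4}, {4, 4}};
    assert(one_to_one_dim(a, 0) == one_to_one_dim(b, 0));
}

} // namespace sketch_n_to_one
```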
More shower thoughts from my side:
- Since this must inspect the RM result subranges for each component, it might interfere with our goal of allowing RMs to return regions in the future.
- `neighborhood` unfortunately cannot be used in a component since input and output dimensions must match.
- Although "higher order range mappers" feel very neat conceptually, the few relevant component mappings (`kernel_dim`, `fixed`, `all`) could also be expressed through purpose-built tag types that lead to a less surprising interface.
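For comparison, the tag-type alternative from the last point might look roughly like the following. This is only one possible shape, and every name in it (`from_kernel_dim`, `fixed_interval`, `whole_dim`, `component`, `resolve`) is made up for this sketch. Each buffer dimension is described by a plain tag value instead of a callable, so nothing needs to inspect the result of a user-supplied range mapper.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <variant>

namespace sketch_tags {

// Hypothetical tag types, one per relevant component mapping.
struct from_kernel_dim { std::size_t dim; };                  // like kernel_dim
struct fixed_interval { std::size_t offset, range; };         // like fixed
struct whole_dim {};                                          // like all

using component = std::variant<from_kernel_dim, fixed_interval, whole_dim>;

struct interval {
    std::size_t offset;
    std::size_t range;
};

// Resolves a single component tag against a chunk and the extent of one buffer dimension.
template <std::size_t KernelDims>
interval resolve(const component& c,
                 const std::array<std::size_t, KernelDims>& chunk_offset,
                 const std::array<std::size_t, KernelDims>& chunk_range,
                 std::size_t buffer_extent) {
    if (const auto* k = std::get_if<from_kernel_dim>(&c)) {
        return {chunk_offset[k->dim], chunk_range[k->dim]};
    }
    if (const auto* f = std::get_if<fixed_interval>(&c)) {
        return {f->offset, f->range};
    }
    return {0, buffer_extent}; // whole_dim
}

// The "rows" mapping from the top of the thread, expressed as two tags.
inline void check_rows_via_tags() {
    const std::array<std::size_t, 1> offset{4};
    const std::array<std::size_t, 1> range{8};
    const interval row = resolve(from_kernel_dim{0}, offset, range, 32);
    const interval col = resolve(whole_dim{}, offset, range, 16);
    assert(row.offset == 4 && row.range == 8);
    assert(col.offset == 0 && col.range == 16);
}

} // namespace sketch_tags
```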