[Arc] Allow uniform operands to the VectorizeOp

Open fabianschuiki opened this issue 9 months ago • 0 comments

The arc.vectorize op currently has no way of capturing that all vector lanes share the exact same input operand. This matters for vectorizing arc.state ops, whose clock, enable, and reset must be identical across all lanes for the op to be vectorizable.

For example,

arc.vectorize (%0, %1), (%2, %3), (%clk, %clk) {
^bb0(%arg0, %arg1, %arg2):
  arc.state @Foo(%arg0, %arg1) clock %arg2
}

would vectorize into (pseudo-ops)

%v0 = vector_create %0, %1
%v1 = vector_create %2, %3
arc.state @FooVec(%v0, %v1) clock %clk

where the inputs to the arc @Foo can be vectorized, because we can vectorize the entire arc definition as @FooVec, but the clock %clk has to remain a scalar. This is only possible if all vector lanes use the same clock %clk.
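
The uniformity condition above can be checked mechanically. The following is a minimal Python sketch, not CIRCT code: it models each operand group as a tuple of SSA value names, and `split_uniform` is a hypothetical helper introduced only for illustration. It assumes that any group whose lanes are all identical counts as uniform; a real pass might restrict this to clock, enable, and reset operands.

```python
from typing import List, Sequence, Tuple


def split_uniform(
    groups: Sequence[Sequence[str]],
) -> Tuple[List[Tuple[str, ...]], List[str]]:
    """Separate operand groups whose lanes all name the same SSA value.

    Hypothetical model of the check described in the issue. Returns
    (vector_groups, uniform_operands), each in original relative order.
    """
    vector_groups: List[Tuple[str, ...]] = []
    uniform: List[str] = []
    for group in groups:
        if not group:
            raise ValueError("operand group must have at least one lane")
        if all(lane == group[0] for lane in group):
            # Every lane uses the same value, so it can stay a scalar.
            uniform.append(group[0])
        else:
            vector_groups.append(tuple(group))
    return vector_groups, uniform


# The groups from the example: two data groups and one shared clock.
vec, uni = split_uniform([("%0", "%1"), ("%2", "%3"), ("%clk", "%clk")])
print(vec, uni)
```

Under this model, the `(%clk, %clk)` group is the only one that ends up in the uniform list, while the two data groups remain candidates for `vector_create`.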

It might be useful to capture the uniformity of %clk as part of the vectorize op itself:

arc.vectorize (%0, %1), (%2, %3) uniform %clk {
^bb0(%arg0, %arg1, %arg2):
  arc.state @Foo(%arg0, %arg1) clock %arg2
}

This makes lowering easier, since the vectorize op allows its operands to be vectorized at a different point in time than its body. Without the uniform operands, the block arguments of the body would vectorize into something like %arg0: vector<2 x i42>, %arg1: vector<2 x i42>, %arg2: vector<2 x !seq.clock>. The arc.state op requires a scalar clock, though, so the lowering would not be possible. With the uniform operands, it is clear that %arg2 is a single uniform !seq.clock, which allows the body to be vectorized.

To implement this, @maerhart suggested adding a group of variadic operands that contains all the uniform operands. For example:

arc.vectorize (a, b), (c, d), (e, f) {
^bb0(%vAB, %vCD, %e, %f):
  ...
}

The last operand group (e, f) is understood to be the uniform operands. It can be empty if there are no uniform operands. The body block would have one block argument for the first group (a, b), one block argument for the second group (c, d), and then block arguments for each operand in the final group (e, f).
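
The resulting body block signature can be sketched as follows. This is a Python model under stated assumptions, not the Arc dialect's API: `body_block_arg_types` is a hypothetical helper, types are plain strings, and the `vector<N x T>` spelling simply follows the notation used in this issue.

```python
from typing import List, Sequence


def body_block_arg_types(
    lanes: int,
    vector_elem_types: Sequence[str],
    uniform_types: Sequence[str],
) -> List[str]:
    """Model the body block arguments after operand vectorization.

    One vector-typed argument per vectorized operand group, followed by
    one scalar argument per operand in the trailing uniform group.
    """
    if lanes < 1:
        raise ValueError("a vectorize op needs at least one lane")
    vector_args = [f"vector<{lanes} x {t}>" for t in vector_elem_types]
    # Uniform operands keep their scalar type; the list may be empty.
    return vector_args + list(uniform_types)


# Two i42 data groups over two lanes, plus one uniform clock.
print(body_block_arg_types(2, ["i42", "i42"], ["!seq.clock"]))
```

With an empty uniform list, the result degenerates to the current behaviour where every block argument is a vector.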

fabianschuiki, May 21 '24 17:05