XLS optimizer can maybe transform sequential multiplies into adds
For array updates written in DSLX like
let a = for(i, o): (u32, u32[5]) in range(u32:0, u32:5) {
val = i * x
update(o, i, val)
} ( u32[5]:[0, ...]);
which initializes an array to [0, x, 2x, 3x, 4x, 5x]. The corresponding IR becomes something like
val0 = 0
val1 = x
val2 = umul(x, 2)
val3 = umul(x, 3)
val4 = umul(x, 4)
a = array_update(a, 0, val0)
a = array_update(a, 1, val1)
a = array_update(a, 2, val2)
a = array_update(a, 3, val3)
depending on the cost of the multiplies vs adds vs whether or not this is part of the critical path of the block, this can be instead realized by additions alone, transforming the parallel multiplies into a chain of adds.
val0 = 0
val1 = x
val2 = val1 + x
val3 = val2 + x
val4 = val3 + x
a = array_update(a, 0, val0)
a = array_update(a, 1, val1)
a = array_update(a, 2, val2)
a = array_update(a, 3, val3)
This issue is to enhance XLS to be able to perform that analysis and choose the alternative structure when it makes sense.
This issue could also be addressed by understanding the structure of multiplies as a sequence of shifts and adds and then performing some sort of datapath synthesis on the result in order to generate a CSA tree that in parallel computes val2, val3, and val4.