Pad ct-ct matmul inputs for bicyclic encoding
A continuation of https://github.com/google/heir/issues/1376
Some kernels, like ct-ct matmul, require that the input dimensions have certain properties, such as being coprime.
Since different kernels have different requirements, the layout-optimizer needs to be able to adjust packings to compensate for this. In particular, if we choose a bicyclic matmul, then the adjusted layout needs to be zero-padded to coprime dimensions.
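As a rough illustration of what "padded to coprime dimensions" could mean, here is a minimal Python sketch that searches for the padded shape with the least total padding. The function name `coprime_padded_shape` and the "least total padding" policy are assumptions made for illustration; they are not taken from HEIR, and the bicyclic kernel may impose further constraints not modeled here.

```python
from math import gcd


def coprime_padded_shape(rows: int, cols: int) -> tuple[int, int]:
    """Return (rows', cols') with rows' >= rows, cols' >= cols and gcd == 1.

    Hypothetical policy: minimize the total padding (rows' - rows) + (cols' - cols),
    preferring to pad columns first when there is a tie.
    """
    if rows < 1 or cols < 1:
        raise ValueError("dimensions must be positive")
    extra = 0
    while True:
        # Try every way of splitting `extra` padding between rows and columns.
        for extra_rows in range(extra + 1):
            extra_cols = extra - extra_rows
            padded = (rows + extra_rows, cols + extra_cols)
            if gcd(*padded) == 1:
                return padded
        # Terminates: consecutive integers are coprime, so some shape qualifies.
        extra += 1
```

For a 4x4 input this returns (4, 5); shapes that are already coprime, such as (3, 5), come back unchanged.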
Ideally this could be done entirely in the layout descriptor, without having to actually zero-pad the data matrix. At first I thought you could do this by expressing a map injecting the data matrix into a space with a larger range, e.g., if we had a 4x4 matrix, an injection into a matrix of size (5, 7) would be given as
{ [row, col] -> [row', col'] : 0 <= row <= 3 and 0 <= col <= 3 and 0 <= row' <= 4 and 0 <= col' <= 6 and row' = row and col' = col }
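But ISL simplifies this by tightening the bounds on the range variables: since row' = row and col' = col, the bounds 0 <= row' <= 3 and 0 <= col' <= 3 are already implied, so the looser bounds are redundant and the padded size is lost from the map.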
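To see why the looser bounds carry no information, here is a small pure-Python sketch (it does not call ISL) that enumerates the relation explicitly. The helper name `relation_pairs` is hypothetical. The relation built with a (5, 7) target contains exactly the same pairs as the one built with a (4, 4) target, so no simplifier could tell them apart.

```python
from itertools import product


def relation_pairs(data_shape, target_shape):
    """Enumerate {[row, col] -> [row', col']} with row' = row and col' = col.

    Both shapes are (num_rows, num_cols); indices run from 0 to size - 1.
    """
    rows, cols = data_shape
    target_rows, target_cols = target_shape
    return {
        ((r, c), (r2, c2))
        for r, c, r2, c2 in product(
            range(rows), range(cols), range(target_rows), range(target_cols)
        )
        if r2 == r and c2 == c
    }


padded = relation_pairs((4, 4), (5, 7))
tight = relation_pairs((4, 4), (4, 4))

# The image of the map never reaches the extra rows/columns of the target.
image = {out for _, out in padded}
max_row = max(r for r, _ in image)
max_col = max(c for _, c in image)
same_relation = padded == tight
```

Here `max_row` and `max_col` are both 3 and `same_relation` is true, which matches the observation that the (5, 7) extent cannot be recovered from the map alone.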
So now I think it would be easiest to just insert zero-padding ops on the data matrix in layout-optimization once we have decided the bicyclic matmul kernel is what we want.
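For reference, the effect such a zero-padding op would have on the data matrix can be sketched in plain Python. This only models the data transformation; `zero_pad` is a hypothetical helper, not an existing HEIR op or pass.

```python
def zero_pad(matrix, padded_rows, padded_cols):
    """Embed `matrix` in the top-left corner of a zero matrix of the padded shape."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    if padded_rows < rows or padded_cols < cols:
        raise ValueError("padded shape must be at least as large as the input")
    # Start from all zeros, then copy the original entries in place.
    padded = [[0] * padded_cols for _ in range(padded_rows)]
    for r in range(rows):
        for c in range(cols):
            padded[r][c] = matrix[r][c]
    return padded
```

Padding a 4x4 matrix with `zero_pad(m, 5, 7)` gives a 5x7 matrix whose last row and last three columns are zero, after which the layout for the padded matrix can be computed as usual.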