einops
Why can't differently shaped tensors be concatenated with rearrange?
Currently, concatenation as in the example is done by calling stack_on_zeroth_dimension() first, then rearranging the tensor into the appropriate shape. However, most backend.stack() implementations require all dimensions except the stacked one to be the same, so simple concatenation along a dimension with different lengths is not possible.
For example, if we were to stack an image with 3 channels and an image with a single channel to create a 4-channel image:
import numpy as np
from einops import rearrange

img1 = np.random.randn(300, 200, 3)
img2 = np.random.randn(300, 200, 1)
np.concatenate([img1, img2], axis=2).shape
# (300, 200, 4) as expected
rearrange([img1, img2], 'b w h c -> w h (b c)')
# np.stack error: all input arrays must have the same shape
It would be ideal if, when such cases occur, a concatenation method like np.concatenate or torch.cat were called instead of stack. I am not sure how this might break the simplicity of the rest of the code.
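The stack-then-rearrange behaviour described above can be sketched with NumPy alone. This is only an illustration: `stack_then_merge_channels` is a hypothetical helper, and the transpose/reshape stands in for the pattern 'b w h c -> w h (b c)'. It is not the actual einops internals.

```python
import numpy as np

def stack_then_merge_channels(images):
    # Hypothetical helper: mimics "stack on axis 0, then rearrange".
    # np.stack raises ValueError unless every array has the same shape.
    stacked = np.stack(images)                      # (b, w, h, c)
    b, w, h, c = stacked.shape
    # Equivalent in effect to 'b w h c -> w h (b c)'.
    return stacked.transpose(1, 2, 0, 3).reshape(w, h, b * c)

# Same-shaped inputs: stacking works and matches channel concatenation.
same = [np.random.randn(300, 200, 3), np.random.randn(300, 200, 3)]
merged = stack_then_merge_channels(same)

# Differently shaped inputs: the stack step fails before any rearranging.
mixed = [np.random.randn(300, 200, 3), np.random.randn(300, 200, 1)]
try:
    stack_then_merge_channels(mixed)
    stack_failed = False
except ValueError:
    stack_failed = True

# Plain concatenation only needs the non-concatenated axes to agree.
concatenated = np.concatenate(mixed, axis=2)
```

For same-shaped inputs the result is identical to np.concatenate along the channel axis. The mixed case never gets past the stack step.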
Hi, @Mithrillion, and thanks for taking the time to report.
It may indeed look like a bug, but requiring all arguments to have the same shape is how it should work to keep einops uniform.
Some examples to explain what I mean.
# suppose this works like concatenation, so it concatenates along the channel
x = rearrange([img1, img2], 'b w h c -> w h (b c)')
# this looks like the inverse, but it returns two images of same shape
img1, img2 = rearrange(x, 'w h (b c) -> b w h c', b=2)
# if this is concatenation
x = rearrange([img1, img2], 'b w h c -> w h (b c)')
# this seems to make no sense at all
x = rearrange([img1, img2], 'b w h c -> w h (c b)')
# even harder to understand what could this mean. What are requirements for c,b,h?
x = rearrange([img1, img2], 'b w h c -> w (c b h)')
So, concatenation is meant to work only with arrays of the same shape (at least until a way to avoid inconsistencies is found).
I'll leave this issue open for reference so others can find it easily.
What if there were a concatenate operation with syntax something like:
concatenate([img2, img2], 'x w h c, y w h c -> (x + y) w h c')
I would find this very satisfying to use.
Hi @simonalford42, see this comment: https://github.com/arogozhnikov/einops/issues/56#issuecomment-962584525. There is no ETA for this feature.
Update regarding concatenation: einops just got pack for better concatenation and unpack for much better splits.