
[Feature Request] functions on elements of 1 dimension: reorder (concatenate), and chunk

Open · davidnvq opened this issue 3 years ago · 8 comments

Thank you for making our lives easier when working with tensors. I have the following suggestions, based on #50 and #20.

A. Reorder and concatenation of items of different shapes

A.1 Reorder elements of 1 dimension

As suggested in #50, an operation for reordering the elements of a dimension (e.g. channels) would indeed be useful, especially for those working on images across different libraries (OpenCV, PIL). It is much nicer than juggling raw indices.

I totally agree with @remisphere that we can use the name reorder without misleading users.

# instead of doing this
out = imgs[:, [2, 0, 1, 3], :, : ]
# we can use the below
einops.reorder(imgs, 'batch [rg b a -> b rg a] h w', rg=2, b=1, a=1)

A.2 Concatenation of items of different sizes on 1 dimension

Since we only operate on a single dimension, we can concatenate multiple items of different sizes along that dimension. This easily handles the case mentioned in #20 and is extremely useful for anyone who relies on concatenation in their code. I concatenate tensors of different shapes all the time. For example:

# the three tensors below have different sizes on the 2nd dim
print(x.shape) # [b, 10]
print(y.shape) # [b, 15]
print(z.shape) # [b, 20]

# we can concatenate them as
inputs = [x, y, z]
out = einops.reorder(inputs, 'batch [x y z -> x y z]', x=10, y=15, z=20)
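
For reference, this is a minimal sketch of the plain-framework call the proposal would replace (PyTorch assumed; other frameworks have analogous functions):

import torch

out = torch.cat([x, y, z], dim=1)  # shape: [b, 45]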

The above call is consistent with how einops.rearrange would concatenate inputs whose items share the same shape.

It is possible to split out back into its components x, y, z in three lines using the chunk function below:

x = einops.chunk(out, 'batch [x yz -> x]', x=10)
y = einops.chunk(out, 'batch [x y z -> y]', x=10, y=15)
z = einops.chunk(out, 'batch [xy z -> z]', z=20)
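
For comparison, the same split with plain slicing, given the sizes above (a minimal sketch):

x = out[:, :10]    # first 10 features
y = out[:, 10:25]  # next 15
z = out[:, 25:]    # last 20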

B. Chunking along 1 dimension

In contrast to #50, I don't think it is a good idea to merge chunking into reorder. We can separate these functionalities into the reorder and chunk functions above. Chunking is used frequently when we want to sample parts of datasets and features.

Example in #50:

# remove the alpha channel and the bottom half of 256*256 images:
einops.chunk(imgs, 'batch [rg b a -> b rg] [top bottom -> top] w', rg=2, b=1, top=128, batch=10)
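
For reference, a minimal indexing equivalent (assuming channel order r, g, b, a and torch/numpy-style tensors):

out = imgs[:, [2, 0, 1], :128, :]  # reorder channels to b, rg and keep the top half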

Split dataset into train and val

train_len = int(len(dataset) * 0.8)
train_split = einops.chunk(dataset, '[train val -> train] c h w', train=train_len)
val_split = einops.chunk(dataset, '[train val -> val] c h w', train=train_len)

And we can get the full dataset given train_split and val_split:

dataset = einops.reorder([train_split, val_split], '[train val -> train val] c h w', train=len(train_split), val=len(val_split))
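
The same round trip with plain slicing and concatenation would look like this (a sketch assuming the dataset is a single tensor; torch shown):

import torch

train_split, val_split = dataset[:train_len], dataset[train_len:]
dataset = torch.cat([train_split, val_split], dim=0)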

davidnvq · Aug 20 '20

We can also name the dimension being operated on, for readability: the arguments inside the square brackets then belong to the dimension named right before them.

For example:

reordering

# perform [rg b a -> b rg a] on `c` dimension
einops.reorder(imgs, 'batch c [rg b a -> b rg a] h w', rg=2, b=1, a=1)

concatenation

inputs = [x, y, z]
# concatenate along `feat` dimension.
out = einops.reorder(inputs, 'batch feat [x y z -> x y z]', x=10, y=15, z=20)

chunk

train_len = int(len(dataset) * 0.8)
# split/chunk along `batch` dimension
train_split = einops.chunk(dataset, 'batch [train val -> train] c h w', train=train_len)
val_split = einops.chunk(dataset, 'batch [train val -> val] c h w', train=train_len)

It's up to you whether to keep or remove the space between a dimension and its square brackets, i.e., feat [x y z -> x y z] vs. feat[x y z -> x y z].

davidnvq · Aug 20 '20

@davidnvq great write-up, thanks for the ideas and examples.

Big pros of your suggestion:

  • chunking is verbalized
  • patterns are concise
  • everything is uniquely interpretable

These are the issues I see. First, concatenation looks like it does nothing (the left and right parts of the pattern are identical, yet the inputs and outputs differ):

out = einops.reorder(inputs, 'batch feat [x y z -> x y z]', x=10, y=15, z=20)

The second issue is a conflict with existing notation: 'x y z' looks like a 3d tensor (according to existing operations). This is easy to fix by e.g. writing x+y+z.

arogozhnikov · Aug 20 '20

@davidnvq I'm also poking around the same issues (#20, #50), and pre-testing some concepts. I'll try to post some thoughts on how that can be written soon (but no promises!)

arogozhnikov · Aug 20 '20

> These are the issues I see. First, concatenation looks like it does nothing (the left and right parts of the pattern are identical, yet the inputs and outputs differ):
>
> out = einops.reorder(inputs, 'batch feat [x y z -> x y z]', x=10, y=15, z=20)
>
> The second issue is a conflict with existing notation: 'x y z' looks like a 3d tensor (according to existing operations). This is easy to fix by e.g. writing x+y+z.

I'm sorry, in your tutorial concatenation can be done with x y -> (x y); I just missed that kind of indicator, (). Yeah, x+y+z is a simple and easy-to-understand way. I'm looking forward to your new release. If you need any support from the community, please kindly create some issues that we can help with.
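
For completeness, here is how today's rearrange already handles the equal-shape case (a small working sketch; the axis names two, b, d are illustrative):

import torch
from einops import rearrange

x, y = torch.randn(4, 10), torch.randn(4, 10)
# a list of tensors is stacked along a new leading axis, and (two d) then merges it,
# which amounts to concatenation along d, but only for equal shapes
out = rearrange([x, y], 'two b d -> b (two d)')  # shape: [4, 20]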

davidnvq · Aug 21 '20

Sorry to ask: is davidnvq/einops#1 too old to be merged, or are there other ways to work around this? It's quite awkward when most of the work can be done using einops, but a native framework tool is still needed for tasks like this O-o.

p4perf4ce · Nov 04 '21

@p4perf4ce I was just about to say the same thing. If einops had concat, my code would finally be framework-independent and super neat.

mashrurmorshed · Nov 05 '21

It always bugs me to fall back to torch.cat when most of the tensor manipulation is done with einops; it makes the code less semantic. For example, when you need to concatenate tensors from different branches, or implement skip connections where the numbers of channels don't match.

However, some of the functionalities need discussion first. For example, take a look at the concatenation proposed by @davidnvq in his fork:

# From davidnvq/einops/pull/1 
>>> x = torch.randn(2, 10, 512)
>>> y = torch.randn(2, 10, 128)
>>> z = torch.randn(2, 10, 256)
>>> h = concat([x, y, z], "batch seq [dx dy dz -> (dx dy dz)]", batch=2, seq=10, dx=512, dy=128, dz=256)

I think we don't actually need to separate concat from rearrange, since the number of dimensions of the inputs doesn't change at all.

# Suppose tensors x, y, z share the same structure, except for the axis that needs to be concatenated.
# We denote a variable-length axis with square brackets `[]`
# (a single pair of [] is allowed on each side of the pattern).
# E.g., x, y, z > (B, H, W, [C])
>>> h = rearrange([x, y, z], "B H W [Cx Cy Cz] -> B H W [Cx+Cy+Cz]")  # Like how we stack the list of tensors 
                                                                      # of the same shape, but uses `[]` as
                                                                      # a variable-length trigger and 
                                                                      # `+` for concatenation. 
                                                                      # Thanks to @arogozhnikov for `+`
>>> h.shape
(B, H, W, Cx+Cy+Cz)

# Additionally, this syntax allows us to split the tensor.
>>> batch_group = rearrange(h, "B H W [C1+C2] -> B H W [C1 C2]", C1=Cx, C2=Cy+Cz)
>>> batch_group.shape
((B, H, W, Cx), (B, H, W, Cy+Cz))

# These, by contrast, do practically nothing (unlike `()`, which we can use to stack a list of tensors):
>>> [a, b, c] = rearrange([x, y, z], "B H W [Cx Cy Cz] -> B H W [Cx Cy Cz]") # a==x, b==y, c==z
>>> H = rearrange(h, "B H W [Cx+Cy+Cz] -> B H W [Cx+Cy+Cz]") # H==h

# The following looks strange but is still sensible, as long as the numbers add up.
# Here, we can still treat `Cb+Cc` from the left-hand side as a named variable
# whose validity we need to evaluate.
# This is particularly useful when we use some black-box function/method that promises
# to do concatenation, but we want to make sure the output is valid.
>>> s = someBlackBoxMethod(b, c)
>>> K = rearrange(
    [a, s], "B H W [Ca Cb+Cc] -> B H W [Ca+Cb+Cc]",
    Ca=a.shape[-1], Cb=b.shape[-1], Cc=c.shape[-1]
)

However, the proposed syntax shouldn't allow any operation on the variable-length axis other than concatenation and moving the axis around.

# This makes no sense:
>>> rearrange([x, y, z], "B H W ([Cx Cy Cz] K) -> B H W [Cx+Cy+Cz] K")
# This makes sense.
# Also, it wouldn't matter whether you rearrange each tensor and then concatenate,
# or concatenate and then rearrange:
>>> rearrange([x, y, z], "B (h1 h2) W [Cx Cy Cz] -> B h1 h2 [Cx+Cy+Cz] W", h1=N, h2=M)

For this proposed syntax:

# From davidnvq/einops/pull/1 
###  Example: concatenate
>>> h = concat([x, y, z], "batch seq [... -> ...]")
>>> h = concat([x, y, z], "batch seq [... -> d]")

# If we change this into the usual rearrange syntax:
>>> h = concat([x, y, z], "batch seq [...] -> batch seq [d]") # Makes sense, but looks like a reduction.
>>> h = rearrange([x, y, z], "batch seq [...] -> batch seq [...]") # What does this actually mean?
# Should it be interpreted as doing nothing, or as concatenation?

I personally think this ellipsis style violates the purpose of being strictly determined, because I can't immediately infer what's actually going on. If there are going to be that many tensors, say 10 or more, we can work around it like this:

>>> tensors = [x, y, z]
>>> dim_var = [f'_{i}' for i in range(len(tensors))]
>>> sep_dim = ' '.join(dim_var)
>>> cat_dim = '+'.join(dim_var)
>>> h = rearrange(tensors, f"batch seq [{sep_dim}] -> batch seq [{cat_dim}]")

It's a bit messy, but it still keeps the tensor manipulation down to just string manipulation.

Alternatively, we may be just doing this...

# This is allowed as long as there is only one variable-length axis.
# Also, no named variable gets dropped on the right-hand side.
# Note: `+` is not allowed in this case, where the number of named vars in `[]`
# doesn't match the number of elements in the list.
>>> h = rearrange([x, y, z], "B H W [C] -> B H W C") 

Thanks @arogozhnikov for this wonderful tool, liberating us from cross-framework tensor manipulation, and for the '+' advice above. Would love to know your thoughts on this.

p4perf4ce · Nov 05 '21

@p4perf4ce thanks for thinking about that loudly with examples.

I've been poking around with semantics for such an operation (I've dubbed it rechunk); it has some overlap with your suggestion.

One critical choice: how to specify which axis is modified?

Representing both input and output in full shape (as in your suggestion) packs too much into a single operation and does not focus on the axis; it is unclear what a user should focus on.

In my experiments I've landed on something very similar: a "list" in the pattern, with lists as input/output. In your suggestion there is an exceptional case of a single element in a list, and the introduction of special cases should be avoided.

I have converged on:

[result] = rechunk([a, b, c], '[x,y,z] -> [x+y+z]', axis='b h w *')

It is possible to also support something like:

result = rechunk([a, b, c], '[x,y,z] -> x+y+z', axis='b h w *')

...but static code analysis would go crazy; it's easier to just always take and return lists.

The problem of an arbitrary number of inputs is a hard one. I've tried something like this:

[result] = rechunk([a, b, c], '[*x] -> [concat(x)]', axis='b h w *')

...too complex, and there's no need for this flexibility. The following could completely cover all necessary cases:

[result] = rechunk([a, b, c], 'concatenate', axis='b h w *')

There is a natural requirement to have an "inversion" of concatenation (which can properly work only if the pattern contains information about a single axis).

I can post a more detailed RFC with the suggestion if that's interesting to discuss, but I won't be able to dedicate time to implementing/supporting it.

arogozhnikov · Nov 07 '21

Would really love this feature!

austinmw · Oct 14 '22

Thanks for the discussion folks!

The brand-new einops.pack and einops.unpack cover the common cases for concatenate and chunk, so I'm closing this:

https://github.com/arogozhnikov/einops/blob/master/docs/4-pack-and-unpack.ipynb
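
For anyone landing here later, a minimal sketch of the original example from this thread, written with pack/unpack (torch tensors assumed):

import torch
from einops import pack, unpack

x, y, z = torch.randn(2, 10), torch.randn(2, 15), torch.randn(2, 20)
# pack concatenates along the axis marked '*' and records each part's shape
out, ps = pack([x, y, z], 'b *')     # out.shape == (2, 45); ps == [(10,), (15,), (20,)]
# unpack is the exact inverse
x2, y2, z2 = unpack(out, ps, 'b *')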

arogozhnikov · Nov 08 '22