replace a lot of the confusing conv reshapes, etc... with einops
Forgive my ignorance, but I'm not really sure what this issue means.
In conv2d the line:

```python
ret = (x * weight.reshape(1, groups, rcout, 1, 1, cin, H, W)).sum((-3, -2, -1)).reshape(bs, cout, oy, ox)
```
is equivalent(*1) in numpy terms to:

```python
ret = np.einsum('ngcyxkji,gckji->ngcyx', x.numpy(), weight.reshape(groups, rcout, cin, H, W).numpy()).reshape(bs, cout, oy, ox)
```
Is this what is meant by "einops"?
If so, I imagine that shoving the result of the above into a tensor wouldn't maintain the gradient. Does tinygrad have an einsum op?

This issue and one other are the only places where GitHub search surfaces the word "einop". The word einsum appears in the implementation of MULACC, but I don't think that's it. If it is, could you please explain the name? It isn't mentioned anywhere else in the code (again, according to GitHub search), unless you count the A * A -> B in the README, which doesn't suggest einsum or anything like it.
(*1): The result is close with random inputs, but many entries are off by ~1e-6. I'm assuming this is due to non-deterministic algorithms or something similarly negligible.
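For anyone who wants to reproduce the equivalence claim above, here's a small self-contained NumPy check. The shapes are made up for illustration, and `x` stands for the input after it has already been unfolded into patches (the thread's `x` comes out of tinygrad's pooling step, which isn't shown here):

```python
import numpy as np

# illustrative shapes, not tinygrad's actual defaults
bs, groups, rcout, oy, ox, cin, H, W = 2, 2, 3, 4, 5, 2, 3, 3
cout = groups * rcout

# `x` plays the role of the already-unfolded input patches
x = np.random.randn(bs, groups, rcout, oy, ox, cin, H, W)
weight = np.random.randn(cout, cin, H, W)

# the reshape/multiply/sum form from the conv2d line
ref = (x * weight.reshape(1, groups, rcout, 1, 1, cin, H, W)).sum((-3, -2, -1)).reshape(bs, cout, oy, ox)

# the einsum form proposed above
alt = np.einsum('ngcyxkji,gckji->ngcyx', x, weight.reshape(groups, rcout, cin, H, W)).reshape(bs, cout, oy, ox)

assert np.allclose(ref, alt)
```

In float64 the two agree to within `np.allclose` tolerances; the ~1e-6 differences mentioned above are consistent with float32 accumulation order, not a real mismatch.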
It's easy to maintain a gradient through einops. If you write it in tinygrad, it will be both fast and have a gradient :)
Hi, I would like to work on this. Is it okay?
Please do, I still don't get what is meant by "einops".
It is a library for readable and flexible tensor operations: https://github.com/arogozhnikov/einops
Thank you for clarifying.
Hello @geohot, @sixChar, and @rabiaedayilmaz! To shed some light on the concept, here's a rough idea of how the convolution snippet might be refactored with einops:
```python
from einops import rearrange, reduce

# assume `x` is the input already unfolded to (bs, groups, rcout, oy, ox, cin, H, W),
# matching the shape the original line multiplies against
w = rearrange(weight, '(g rc) cin kh kw -> 1 g rc 1 1 cin kh kw', g=groups)

# now we perform the multiply-and-reduce in a more readable way
ret = reduce(x * w, 'bs g rc oy ox cin kh kw -> bs (g rc) oy ox', 'sum')
```
As you can see, Einops provides a clear, readable way to reshape and reduce tensors, enhancing the maintainability and readability of the codebase.
Anybody taking this up?