replace a lot of the confusing conv reshapes, etc... with einops
Forgive my ignorance, but I'm not really sure what this issue means.
In conv2d the line:

```python
ret = (x * weight.reshape(1, groups, rcout, 1, 1, cin, H, W)).sum((-3, -2, -1)).reshape(bs, cout, oy, ox)
```
is equivalent(*1) in numpy terms to:

```python
ret = np.einsum('ngcyxkji,gckji->ngcyx', x.numpy(), weight.reshape(groups, rcout, cin, H, W).numpy()).reshape(bs, cout, oy, ox)
```
Is this what is meant by "einops"?
If so, I imagine that shoving the result of the above into a tensor wouldn't maintain the gradient. Does tinygrad have an einsum op?

This issue and one other are the only places where GitHub search surfaces the word "einop". The word einsum appears in the implementation of MULACC, but I don't think that's it. If it is, could you please explain the name? It isn't mentioned anywhere else in the code (again, according to GitHub search), unless you count the A * A -> B in the README, which doesn't suggest einsum or anything like it.
(*1): The result is close with random inputs, but many entries are off by ~1e-6. I'm assuming this is due to non-deterministic algorithms or something similarly negligible.
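For anyone who wants to reproduce the equivalence claim above, here's a small self-contained NumPy check. The shapes are made up for illustration, and `x` stands for the input after it has already been unfolded into patches (the thread's `x` comes out of tinygrad's pooling step, which isn't shown here):

```python
import numpy as np

# illustrative shapes, not tinygrad's actual defaults
bs, groups, rcout, oy, ox, cin, H, W = 2, 2, 3, 4, 5, 2, 3, 3
cout = groups * rcout

# `x` plays the role of the already-unfolded input patches
x = np.random.randn(bs, groups, rcout, oy, ox, cin, H, W)
weight = np.random.randn(cout, cin, H, W)

# the reshape/multiply/sum form from the conv2d line
ref = (x * weight.reshape(1, groups, rcout, 1, 1, cin, H, W)).sum((-3, -2, -1)).reshape(bs, cout, oy, ox)

# the einsum form proposed above
alt = np.einsum('ngcyxkji,gckji->ngcyx', x, weight.reshape(groups, rcout, cin, H, W)).reshape(bs, cout, oy, ox)

assert np.allclose(ref, alt)
```

In float64 the two agree to within `np.allclose` tolerances; the ~1e-6 differences mentioned above are consistent with float32 accumulation order, not a real mismatch.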
It's easy to maintain a gradient through einops. If you write it in tinygrad, it will be both fast and have a gradient :)
Hi, I would like to work on this. Is it okay?
Please do, I still don't get what is meant by "einops".
It is a library for readable and flexible tensor operations: https://github.com/arogozhnikov/einops
Thank you for clarifying.
Hello @geohot, @sixChar, and @rabiaedayilmaz! To shed some light on the concept, here's a rough idea of how the convolution snippet might be refactored with einops:
```python
from einops import rearrange, reduce

# assume `x` is the input already unfolded to (bs, groups, rcout, oy, ox, cin, H, W),
# matching the shape the original line multiplies against
w = rearrange(weight, '(g rc) cin kh kw -> 1 g rc 1 1 cin kh kw', g=groups)

# now we perform the multiply-and-reduce in a more readable way
ret = reduce(x * w, 'bs g rc oy ox cin kh kw -> bs (g rc) oy ox', 'sum')
```
As you can see, Einops provides a clear, readable way to reshape and reduce tensors, enhancing the maintainability and readability of the codebase.
Anybody taking this up?