Samuel Ainsworth
Hi @frallebini! The writeup in the paper is for the special case of an MLP with no bias terms -- the version in the code is just more general. The...
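For anyone reading along, here is a minimal sketch (my own toy example, not code from the repo) of why bias terms are easy to handle: the bias of a layer is permuted by the same permutation as that layer's output units, so the permuted network computes the identical function.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

p = rng.permutation(4)            # a permutation of the hidden units

W1p, b1p = W1[p], b1[p]           # permute layer-1 outputs and its bias
W2p = W2[:, p]                    # undo the permutation at layer-2 inputs

x = rng.normal(size=3)
h = np.maximum(W1 @ x + b1, 0.0)
hp = np.maximum(W1p @ x + b1p, 0.0)
assert np.allclose(W2 @ h + b2, W2p @ hp + b2)   # identical function
```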
> If the moveaxis-reshape-@ operation corresponded to the Frobenius inner product with w_a, wouldn't A be a scalar?

Ack, you're right! I messed up: it's not actually a Frobenius inner...
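To sketch what I believe the moveaxis-reshape-@ pattern computes (my reconstruction with a made-up helper name, not the repo's exact code): each entry `A[i, j]` is the inner product between unit `i` of model A and unit `j` of model B along the permuted axis, so `A` has shape `(n, n)` rather than being a scalar.

```python
import numpy as np

def unit_inner_products(w_a, w_b, axis):
    """A[i, j] = <unit i of w_a, unit j of w_b> along `axis`.

    moveaxis brings the permuted axis to the front, reshape flattens
    everything else, and @ takes inner products between the rows.
    """
    n = w_a.shape[axis]
    wa = np.moveaxis(w_a, axis, 0).reshape((n, -1))
    wb = np.moveaxis(w_b, axis, 0).reshape((n, -1))
    return wa @ wb.T   # shape (n, n), not a scalar
```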
Hmm, I think the error here is in the first line of (2): the shapes don't line up, since $W_\ell^A$ has shape `(n, *)` and $W_{\ell+1}^A$ has shape `(*,...
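To spell out the dimension bookkeeping under a generic convention (mine; the comment above is truncated): if $W_\ell \in \mathbb{R}^{n_\ell \times n_{\ell-1}}$ maps layer $\ell-1$ activations to layer $\ell$, then $W_{\ell+1} \in \mathbb{R}^{n_{\ell+1} \times n_\ell}$, so the product $W_{\ell+1} W_\ell$ is well-defined while $W_\ell W_{\ell+1}$ is not unless $n_{\ell+1} = n_{\ell-1}$.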
That's correct! In addition, it's necessary when dealing with weight arrays of higher rank as well, e.g. in a convolutional layer where the weights have shape `(w, h, channel_in, channel_out)` (see the sketch below).
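A quick sketch (toy shapes, not repo code) of permuting just the channel axes of a conv kernel while leaving the spatial axes alone:

```python
import numpy as np

rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 8, 16))   # (w, h, channel_in, channel_out)

perm_in = rng.permutation(8)              # input-channel permutation (axis 2)
perm_out = rng.permutation(16)            # output-channel permutation (axis 3)

# np.take permutes only the named axis; the spatial axes (w, h) are untouched.
permuted = np.take(kernel, perm_in, axis=2)
permuted = np.take(permuted, perm_out, axis=3)
assert permuted.shape == kernel.shape
```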
Hi @LeCongThuong, `ps.perm_to_axes` is a dict of the form `PermutationId => [(ParamId, Axis), ...]`, where in this case `PermutationId`s are strings, `ParamId`s are also strings, and `Axis`es are integers. So for...
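A toy example of that structure (the permutation and parameter names here are illustrative, not taken from the repo): each permutation lists every weight array it touches and which axis of that array it permutes.

```python
# PermutationId => [(ParamId, Axis), ...]
perm_to_axes = {
    "P_0": [("Dense_0/kernel", 1), ("Dense_0/bias", 0), ("Dense_1/kernel", 0)],
    "P_1": [("Dense_1/kernel", 1), ("Dense_1/bias", 0), ("Dense_2/kernel", 0)],
}
```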
Hi Ricard! Yes, unfortunately this code was written when PyTorch was still pre-v1.0. I'm not aware of any other implementations, but I'd be happy to help you start any new...
Pyro sounds awesome! Let me know how it goes!
Hi @ogkalu2! The exact framework that the model runs in is not super important. For example, we have used our JAX code to align the weights of two PyTorch models...
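The usual trick (a sketch; the helper name is mine) is to dump the PyTorch parameters into plain numpy arrays, which the JAX code will happily consume, assuming your `PermutationSpec` matches the PyTorch parameter names and axis conventions.

```python
import torch

def state_dict_to_numpy(model: torch.nn.Module) -> dict:
    """Convert a PyTorch model's parameters to plain numpy arrays."""
    return {k: v.detach().cpu().numpy() for k, v in model.state_dict().items()}
```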
This looks to be an error in your permutation spec; I would start by checking which weight arrays it's occurring on (a sanity-check sketch below).
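A sanity check I'd start with (a sketch against the `perm_to_axes` layout described above; `params` here is assumed to map `ParamId`s to weight arrays): every axis tied to the same permutation must have the same size.

```python
def check_perm_spec(perm_to_axes, params):
    """Flag permutations whose tied axes have inconsistent sizes."""
    for perm_id, axes in perm_to_axes.items():
        sizes = {(name, axis): params[name].shape[axis] for name, axis in axes}
        if len(set(sizes.values())) > 1:
            print(f"{perm_id}: inconsistent axis sizes {sizes}")
```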
Hmm, I'm not familiar with the Stable Diffusion architecture... is there a reason not to model permutations on the betas? You could always add them to your `PermutationSpec` and just...
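If it helps, adding an unpermuted parameter is usually just a matter of mapping each of its axes to no permutation (an illustrative sketch; the parameter name and the `axes_to_perm` layout are assumptions on my part, so adapt to your spec):

```python
axes_to_perm = {
    # ...existing entries...
    "betas": (None,),   # None = this axis is not touched by any permutation
}
```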