I'm wondering to what extent shared weights (such as in RNNs) are supported. I understand that I can hack something together by manually composing `apply_fn` calls that reuse the same params. However,...
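For reference, the manual workaround mentioned here might look roughly like the sketch below. It is a minimal plain-Python illustration, not any library's API: `dense_apply`, `rnn_apply`, and `shared_params` are hypothetical names, and the only assumption borrowed from the question is the `apply_fn(params, inputs)` calling convention.

```python
import math

def dense_apply(params, x):
    """Hypothetical apply_fn: affine map W @ x + b on plain Python lists."""
    W, b = params
    return [sum(w * v for w, v in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def rnn_apply(params, xs):
    """Unroll a simple tanh RNN, reusing the same params at every time step."""
    h = [0.0] * len(params[1])  # hidden state starts at zero
    for x in xs:
        # The same `params` object is passed at each step: this is the weight sharing.
        h = [math.tanh(v) for v in dense_apply(params, h + x)]
    return h

# Shared parameters (illustrative values): hidden size 2, input size 1, so W is 2 x 3.
shared_params = ([[0.5, 0.0, 1.0], [0.0, 0.5, -1.0]], [0.0, 0.0])
final_h = rnn_apply(shared_params, [[1.0], [0.0]])
```

Because there is only one parameter set, any gradient or kernel computation on top of this would have to account for every time step touching the same weights, which is the part that needs library support.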
I believe a more general treatment will consider repeated singular values. In that situation, it was useful for me to notice that unitary matrices form a group, so left multiplication...
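As a sketch of why repeated singular values need separate care (my own illustration, assuming a square matrix $A$ for simplicity): if $A = U \Sigma V^{*}$ and $Q$ is any unitary matrix that commutes with $\Sigma$, which is possible with a non-diagonal $Q$ exactly when some singular value is repeated, then

```latex
\begin{aligned}
(UQ)\,\Sigma\,(VQ)^{*}
  &= U Q \Sigma Q^{*} V^{*} \\
  &= U \Sigma Q Q^{*} V^{*} && \text{since } Q\Sigma = \Sigma Q \\
  &= U \Sigma V^{*} = A     && \text{since } Q Q^{*} = I .
\end{aligned}
```

Since unitary matrices form a group, $UQ$ and $VQ$ are again unitary, so this is another valid SVD of the same $A$, and the singular vectors are only determined up to such a $Q$.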
### 🚀 The feature, motivation and pitch

Hi, really appreciate the work that y'all are doing. Are y'all planning to release the checkpoint files for OLMo 7B? Right now the...