RWKV
Here is an annotated implementation of RWKV that I am completing with @Quentin-Anthony. Paper: https://arxiv.org/abs/2305.13048
We have not been able to build all the docs because we do not have access to pylit. Could you either generate the docs yourselves or open-source your internal version of pylit, so that we can build them and make sure they are formatted correctly?
Also, we have not yet finished the training loop implementation at line 136 of labml_nn/RWKV/experiment.py.
I will generate the HTML when you are ready.
Thanks for the contribution!
Alright, this should be ready for review. Let us know if you need anything else here.
Sorry for the delay; I've been busy with work. I generated the documentation and adjusted the formatting a little.
The generated docs are here: https://nn.labml.ai/RWKV/
I feel a few more comments would help. Let me know what you think, and we can link it from the home page once it's ready.
Also, why do you have a custom LayerNorm implementation? Can we use PyTorch's LayerNorm, or the LayerNorm implemented here: https://nn.labml.ai/normalization/layer_norm/index.html
Thanks!
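For context on the LayerNorm question, here is a minimal pure-Python sketch of what a standard layer norm computes over a single feature vector. The function name `layer_norm` and its parameters are illustrative only; they are not taken from this PR or from the labml implementation. If the custom module does nothing beyond this computation, a stock layer norm could likely replace it.

```python
import math


def layer_norm(x, weight=None, bias=None, eps=1e-5):
    """Normalize one feature vector (a list of floats) to zero mean and
    roughly unit variance, then apply an optional elementwise scale and shift.

    Hypothetical reference helper, written for illustration only.
    """
    n = len(x)
    mean = sum(x) / n
    # Biased variance (divide by n), the usual convention for layer norm.
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    out = [(v - mean) * inv_std for v in x]
    if weight is not None:
        # Optional learned scale, one entry per feature.
        out = [w * v for w, v in zip(weight, out)]
    if bias is not None:
        # Optional learned shift; a bias-free variant simply skips this step.
        out = [v + b for v, b in zip(out, bias)]
    return out


if __name__ == "__main__":
    print(layer_norm([1.0, 2.0, 3.0, 4.0]))
```

One possible reason to keep a custom module is an optional bias term, which this sketch models by allowing `bias=None`; whether that applies to this PR is an assumption on my side.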