RWKV-LM
Implementation details about wkv
Amazing work! However, I'm confused about the implementation details of the wkv CUDA kernel (in RWKV-LM\RWKV-v4neo\cuda\wkv_cuda.cu). How does the implementation correspond to the equations shown in the README? Could you add more detailed comments explaining it? For example, what do the local variables p, pp, ... mean? Thanks.