Wang hl
Amazing work! But I'm really confused about the implementation details of the wkv CUDA kernel (in RWKV-LM\RWKV-v4neo\cuda\wkv_cuda.cu). How does the implementation match the equations shown in the README? Could you please give...
Hi, thanks for your work :) I'm wondering how I can fine-tune RWKV given a pretrained model. I know that there is one repo (https://github.com/Blealtan/RWKV-LM-LoRA) using LoRA for...
Great work! But when I use torch==2.0.0, I find that compilation for ViT fails. I get a warning: [2023-03-27 12:49:31,505] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64) function: 'forward' (/opt/conda/lib/python3.8/site-packages/vit_pytorch/vit.py:19) reasons:...
### Background and motivation Hi, thanks for your work. But when I'm trying to migrate my PyTorch code to OneFlow, I find that there are only a few APIs in...
## Description ## Related Issue ## TODO ## Check List - [ ] merge the latest version source branch/repo, and resolve all the conflicts - [ ] pass style check...