
Create RWKV language model from config, not loading from file, without CUDA

Open James4Ever0 opened this issue 1 year ago • 1 comments

The code in RWKV-LM/RWKV-v4neo/src/model.py requires CUDA to create an RWKV model.

I want to change the code by replacing the first embedding layer with a linear layer to fit my needs.

The code of rwkv.model.RWKV only allows me to load from existing model weights.

Where or how can I create a new RWKV model from a config rather than from existing model weights? And how do I change the first layer of the model?

James4Ever0, Apr 18 '23 02:04

Maybe I misunderstood you, but in train.py the default configuration creates a new RWKV model rather than loading an existing one.

To train on your own dataset, you can start from this command:

python train.py --load_model "" --wandb "" --proj_dir "out" \
     --data_file "./enwik8" --data_type "utf-8" --vocab_size 0 \
     --ctx_len 512 --epoch_steps 5000 --epoch_count 500 --epoch_begin 0 --epoch_save 5 \
     --micro_bsz 12 --n_layer 6 --n_embd 512 --pre_ffn 0 --head_qk 0 \
     --lr_init 8e-4 --lr_final 1e-5 --warmup_steps 0 --beta1 0.9 --beta2 0.99 --adam_eps 1e-8 \
     --accelerator gpu --devices 1 --precision bf16 --strategy ddp_find_unused_parameters_false --grad_cp 0
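
On the second part of the question (replacing the first embedding layer with a linear layer), it may help to see why the swap is possible at all. An embedding lookup gives the same result as a bias-free linear layer applied to a one-hot vector, so a linear layer with the same weight shape can take arbitrary feature vectors instead of token ids. The sketch below is plain Python so that it runs without PyTorch or CUDA. It is only an illustration of the idea, not code from RWKV-LM: the helper names (make_table, embedding_lookup, linear, one_hot) and the toy sizes are made up for this example.

```python
import random


def make_table(in_features, n_embd, seed=0):
    """Build an in_features x n_embd weight table with small random values."""
    rng = random.Random(seed)
    return [[rng.uniform(-1e-4, 1e-4) for _ in range(n_embd)]
            for _ in range(in_features)]


def embedding_lookup(table, token_id):
    """Embedding layer: return the row of the table selected by token_id."""
    return list(table[token_id])


def linear(table, features):
    """Bias-free linear layer: multiply a feature vector by the table."""
    n_embd = len(table[0])
    return [sum(f * row[j] for f, row in zip(features, table))
            for j in range(n_embd)]


def one_hot(token_id, size):
    """Vector of zeros with a single 1.0 at position token_id."""
    return [1.0 if i == token_id else 0.0 for i in range(size)]


# Toy sizes, chosen only for the demonstration.
table = make_table(in_features=8, n_embd=4)

# A one-hot input to the linear layer reproduces the embedding lookup.
from_embedding = embedding_lookup(table, 3)
from_linear = linear(table, one_hot(3, 8))

# A dense input has no embedding equivalent; only the linear layer accepts it.
dense_input = [0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
from_dense = linear(table, dense_input)
```

In a PyTorch model the corresponding change would be to swap the nn.Embedding module for an nn.Linear with the same output width and feed it float features instead of token ids. Which attribute holds that layer in model.py, and whether the rest of the training script (data loading, loss) still fits your inputs, is something you would need to check in the source.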

lantudou, Apr 19 '23 02:04