
BPE Tokenizer

Open wannaphong opened this issue 9 months ago • 2 comments

Is it possible to use a BPE tokenizer instead of rwkv_vocab_v20230424 in the next model?

I tried the RWKV model on Thai. The output looks good, but it is very slow because rwkv_vocab_v20230424 tokenizes Thai at the character level.

I think that if the next model used a BPE tokenizer like Qwen2's, it could improve both the model quality and the speed.
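To illustrate the point about speed, here is a rough sketch of why character-level tokenization produces more tokens (and so more generation steps) than BPE. It is a toy BPE trainer written only for this example. It is not the rwkv_vocab_v20230424 tokenizer or the Qwen2 tokenizer, and the tiny corpus, the merge count, and the helper names (`apply_merge`, `train_bpe`, `encode`) are all made up for the demonstration.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy trainer)."""
    # Every word starts as a sequence of single characters.
    corpus = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs in the corpus.
        pairs = Counter()
        for symbols, freq in corpus.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent pair
        merges.append(best)
        # Rewrite the corpus with the new merge applied.
        merged = Counter()
        for symbols, freq in corpus.items():
            merged[apply_merge(symbols, best)] += freq
        corpus = merged
    return merges


def encode(text, merges):
    """Tokenize `text` by applying the learned merges in training order."""
    symbols = tuple(text)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)


# Hypothetical tiny Thai corpus, for illustration only.
corpus_words = ["สวัสดี", "สวัสดี", "สวัสดี", "ครับ", "ครับ", "ขอบคุณ"]
merges = train_bpe(corpus_words, num_merges=8)

text = "สวัสดีครับ"
char_tokens = list(text)            # character-level: one token per code point
bpe_tokens = encode(text, merges)   # BPE: frequent sequences become single tokens

print(len(char_tokens), "character-level tokens")
print(len(bpe_tokens), "BPE tokens:", bpe_tokens)
```

Since the model produces one token per step, fewer tokens for the same text means fewer steps for the same output. How large the gain is in practice depends on the real vocabulary and training data.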

wannaphong, Jan 28 '25 06:01