
The original LLaMA has very limited support for Chinese. Has BELLE expanded the Chinese vocabulary?

Open sunzhaowei opened this issue 1 year ago • 1 comments

Reportedly, the original LLaMA tokenizer supports only a little over 700 Chinese characters.

sunzhaowei avatar Apr 06 '23 16:04 sunzhaowei

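If the original supports only 700-odd characters, then this one has definitely been expanded. I tried it and the results are pretty good.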

Tongjilibo avatar Apr 10 '23 15:04 Tongjilibo

https://arxiv.org/pdf/2304.07854.pdf

xianghuisun avatar Aug 01 '23 11:08 xianghuisun