
How to train an LLM tokenizer

2 how-to-train-tokenizer issues

Hi, thanks for sharing. Could you tell me where you downloaded these datasets? Thanks.

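Hello, I used the sentencepiece training in this project to train a Chinese vocabulary, but it cannot be merged with falcon's English vocabulary. The falcon English vocabulary loaded with AutoTokenizer has no sp_model attribute. How can I solve this?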
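
One possible workaround, offered only as a rough sketch and not as this project's own merge procedure: a tokenizer without `sp_model` is usually a "fast" tokenizer that is not backed by SentencePiece, so the protobuf-level merge does not apply. Instead, the pieces of the trained SentencePiece model could be added through `add_tokens`. The helper names `select_new_tokens` and `extend_fast_tokenizer`, and the arguments `base_name` and `sp_model_path`, are hypothetical. Added tokens are matched as whole strings rather than merged into the BPE table, so this is not equivalent to a true vocabulary merge.

```python
from typing import Iterable, List, Set


def select_new_tokens(pieces: Iterable[str], existing_vocab: Set[str]) -> List[str]:
    """Return the pieces missing from existing_vocab, in order, without duplicates."""
    seen = set()
    new_tokens = []
    for piece in pieces:
        # SentencePiece marks word boundaries with U+2581; drop the marker
        # so the plain string is what gets compared and added.
        token = piece.replace("\u2581", "")
        if not token or token in existing_vocab or token in seen:
            continue
        seen.add(token)
        new_tokens.append(token)
    return new_tokens


def extend_fast_tokenizer(base_name: str, sp_model_path: str):
    """Hypothetical helper: add SentencePiece pieces to a tokenizer via add_tokens."""
    # Imported lazily so select_new_tokens can be used without these packages.
    import sentencepiece as spm
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_name)
    sp = spm.SentencePieceProcessor(model_file=sp_model_path)
    # Note: special pieces such as <unk> are not filtered out in this sketch.
    pieces = [sp.id_to_piece(i) for i in range(sp.get_piece_size())]
    new_tokens = select_new_tokens(pieces, set(tokenizer.get_vocab()))
    num_added = tokenizer.add_tokens(new_tokens)
    return tokenizer, num_added
```

If the extended tokenizer is then used with a model, the embedding matrix would also need to grow, for example with `model.resize_token_embeddings(len(tokenizer))`, and the new rows would need further training before they are useful.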