Baichuan2
transformers version issue
With transformers 4.35.0, the following error occurs: AttributeError: 'BaichuanTokenizer' object has no attribute 'sp_model'. Switching to 4.30.0 runs fine.
Switching to 4.34.0 also runs fine.
Downgrade to below 4.34.
It seems Baichuan2 was developed against transformers 4.31.0.
For me, Python 3.11.2 with transformers 4.34.0 does not work.
Python 3.10.10, PyTorch (GPU) 2.1.2, transformers 4.30.0 works; 4.34.0 still raises the same error.
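If you just need to unblock, pinning transformers to a version reported working in this thread is the quickest route. The exact pin below is only a suggestion based on the reports above, not an official requirement:

```shell
# Pin transformers to 4.30.0, which multiple replies above report as working.
# Baichuan2 was reportedly developed against 4.31.0, so that is another candidate.
pip install "transformers==4.30.0"
```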
4.36.1 also has this problem. Is there a way to specify a tokenizer file outside of the folder?
Modify the BaichuanTokenizer class: it uses sp_model before sp_model has been loaded. Move the two sp_model lines above the super().__init__() call, like this:
    # Load sp_model BEFORE super().__init__(), which may call methods
    # (e.g. get_vocab) that depend on self.sp_model
    self.sp_model = spm.SentencePieceProcessor(**self.sp_model_kwargs)
    self.sp_model.Load(vocab_file)
    super().__init__(
        bos_token=bos_token,
        eos_token=eos_token,
        unk_token=unk_token,
        pad_token=pad_token,
        add_bos_token=add_bos_token,
        add_eos_token=add_eos_token,
        sp_model_kwargs=self.sp_model_kwargs,
        clean_up_tokenization_spaces=clean_up_tokenization_spaces,
        **kwargs,
    )
    self.vocab_file = vocab_file
    self.add_bos_token = add_bos_token
    self.add_eos_token = add_eos_token
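The failure mode behind this fix can be sketched with simplified stand-in classes (these are NOT the real transformers classes, just an illustration): reportedly, newer transformers versions have the base tokenizer's __init__ call methods that the subclass implements in terms of sp_model, so if sp_model is assigned only after super().__init__(), the attribute lookup fails.

```python
class TokenizerBase:
    """Stand-in for the base tokenizer: its __init__ calls get_vocab()."""
    def __init__(self):
        self.vocab = self.get_vocab()

    def get_vocab(self):
        return {}


class BrokenTokenizer(TokenizerBase):
    """sp_model assigned only after super().__init__() -> AttributeError."""
    def __init__(self):
        super().__init__()          # triggers get_vocab() too early
        self.sp_model = object()

    def get_vocab(self):
        _ = self.sp_model           # fails: attribute not set yet
        return {"<unk>": 0}


class FixedTokenizer(TokenizerBase):
    """Load sp_model first, then call super().__init__() -- the thread's fix."""
    def __init__(self):
        self.sp_model = object()    # "load" the model before the base init
        super().__init__()

    def get_vocab(self):
        _ = self.sp_model           # attribute exists now
        return {"<unk>": 0}


try:
    BrokenTokenizer()
    broken_raises = False
except AttributeError:
    broken_raises = True

fixed_vocab = FixedTokenizer().vocab
print(broken_raises, fixed_vocab)  # True {'<unk>': 0}
```

The same reasoning explains why merely downgrading transformers also works: older base-class constructors did not touch the subclass's vocab methods during __init__.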