
transformers version issue

Open wanghanbinpanda opened this issue 1 year ago • 7 comments

With transformers 4.35.0 I get the error: AttributeError: 'BaichuanTokenizer' object has no attribute 'sp_model'. Switching to 4.30.0 makes it run normally~

wanghanbinpanda · Nov 10 '23
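The pragmatic workaround repeated throughout this thread is to pin transformers below 4.34 (reports on 4.34.0 itself conflict; 4.30.0 and 4.31.0 are consistently reported to work). A minimal sketch of a guard that fails early with a clear message instead of the opaque sp_model error; the helper names here are illustrative, not from any library:

```python
def parse_version(v: str) -> tuple:
    """Parse '4.35.0' into (4, 35, 0); pre-release suffixes are ignored."""
    parts = []
    for piece in v.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def check_transformers_version(installed: str) -> None:
    """Raise early instead of hitting the opaque
    "'BaichuanTokenizer' object has no attribute 'sp_model'" error."""
    if parse_version(installed) >= (4, 34, 0):
        raise RuntimeError(
            f"transformers {installed} is reported to break BaichuanTokenizer; "
            "downgrade, e.g.: pip install 'transformers==4.30.0'"
        )

# a version reported to work passes silently
check_transformers_version("4.30.0")
```

In practice you would pass `transformers.__version__` to the check before loading the tokenizer.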

Switching to 4.34.0 also works for me.

snowpalm · Nov 15 '23

Downgrade to below 4.34.

zRzRzRzRzRzRzR · Nov 18 '23

It seems Baichuan2 was developed against transformers 4.31.0.

xyjigsaw · Nov 23 '23

On my machine, Python 3.11.2 with transformers 4.34.0 does not work.

wencan · Dec 24 '23

Python 3.10.10, PyTorch (GPU) 2.1.2: transformers 4.30.0 works, but 4.34.0 still raises the same error.

zzwtop1 · Jan 12 '24

4.36.1 also has this problem. Is there a way to specify a tokenizer file outside of the folder?

jijivski · Jan 14 '24

Modify the BaichuanTokenizer class: it accesses sp_model before sp_model has been loaded. Moving the SentencePiece initialization above the super().__init__() call fixes it:

# Load the SentencePiece model BEFORE calling super().__init__():
# in newer transformers versions the base-class constructor calls
# methods that access self.sp_model.
self.sp_model = spm.SentencePieceProcessor(**self.sp_model_kwargs)
self.sp_model.Load(vocab_file)
super().__init__(
    bos_token=bos_token,
    eos_token=eos_token,
    unk_token=unk_token,
    pad_token=pad_token,
    add_bos_token=add_bos_token,
    add_eos_token=add_eos_token,
    sp_model_kwargs=self.sp_model_kwargs,
    clean_up_tokenization_spaces=clean_up_tokenization_spaces,
    **kwargs,
)
self.vocab_file = vocab_file
self.add_bos_token = add_bos_token
self.add_eos_token = add_eos_token

leoMesss · Apr 03 '24
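The root cause this patch works around can be reproduced in a few lines of plain Python. This is a simplified sketch, not the real transformers or sentencepiece APIs: the stub classes and the 1000-entry vocab are illustrative. The point is that a base-class constructor which calls back into an overridden method fails if the override depends on state the subclass only sets afterwards:

```python
class PreTrainedTokenizerStub:
    """Simplified stand-in for transformers' PreTrainedTokenizer.
    Newer versions (reportedly >= 4.34) call back into subclass
    tokenizer methods from the base constructor."""
    def __init__(self, **kwargs):
        self.vocab_size = self.get_vocab_size()  # calls a subclass method

class SentencePieceStub:
    """Stand-in for spm.SentencePieceProcessor after Load()."""
    def vocab_size(self):
        return 1000  # arbitrary illustrative size

class BrokenTokenizer(PreTrainedTokenizerStub):
    def __init__(self):
        super().__init__()                   # base __init__ runs first ...
        self.sp_model = SentencePieceStub()  # ... sp_model is set too late

    def get_vocab_size(self):
        return self.sp_model.vocab_size()    # AttributeError: no sp_model yet

class FixedTokenizer(PreTrainedTokenizerStub):
    def __init__(self):
        self.sp_model = SentencePieceStub()  # load sp_model first, as in the patch
        super().__init__()

    def get_vocab_size(self):
        return self.sp_model.vocab_size()
```

Constructing BrokenTokenizer raises the same AttributeError as in this thread, while FixedTokenizer works because sp_model already exists when the base constructor runs.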