INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg1hbjeku
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg1hbjeku/_remote_module_non_scriptable.py
INFO:lightning_fabric.utilities.seed:Global seed set to 42
Traceback (most recent call last):
File "/home/cike/zzp/alpaca/chatglm_finetuning/data_utils.py", line 272, in <module>
tokenizer, config, _, _ = dataHelper.load_tokenizer_and_config(tokenizer_class_name=ChatGLMTokenizer,config_class_name=ChatGLMConfig)
File "/home/cike/anaconda/envs/alpaca/lib/python3.9/site-packages/deep_training/data_helper/data_helper.py", line 257, in load_tokenizer_and_config
tokenizer = load_tokenizer(tokenizer_name=tokenizer_name or model_args.tokenizer_name,
File "/home/cike/anaconda/envs/alpaca/lib/python3.9/site-packages/deep_training/data_helper/data_module.py", line 29, in load_tokenizer
tokenizer = class_name.from_pretrained(tokenizer_name, **tokenizer_kwargs)
File "/home/cike/anaconda/envs/alpaca/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1804, in from_pretrained
return cls._from_pretrained(
File "/home/cike/anaconda/envs/alpaca/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1958, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home/cike/zzp/alpaca/chatglm_finetuning/tokenization_chatglm.py", line 211, in __init__
self.sp_tokenizer = SPTokenizer(vocab_file)
File "/home/cike/zzp/alpaca/chatglm_finetuning/tokenization_chatglm.py", line 32, in __init__
self.text_tokenizer = self._build_text_tokenizer(encode_special_tokens=False)
File "/home/cike/zzp/alpaca/chatglm_finetuning/tokenization_chatglm.py", line 65, in _build_text_tokenizer
self._configure_tokenizer(
File "/home/cike/zzp/alpaca/chatglm_finetuning/tokenization_chatglm.py", line 61, in _configure_tokenizer
text_tokenizer.refresh()
File "/home/cike/anaconda/envs/alpaca/lib/python3.9/site-packages/icetk/text_tokenizer.py", line 31, in refresh
self.sp.Load(model_proto=self.proto.SerializeToString())
File "/home/cike/anaconda/envs/alpaca/lib/python3.9/site-packages/sentencepiece/__init__.py", line 904, in Load
return self.LoadFromSerializedProto(model_proto)
File "/home/cike/anaconda/envs/alpaca/lib/python3.9/site-packages/sentencepiece/__init__.py", line 250, in LoadFromSerializedProto
return _sentencepiece.SentencePieceProcessor_LoadFromSerializedProto(self, serialized)
RuntimeError: Internal: [MASK] is already defined.