MiniGPT-4
Asking to pad but the tokenizer does not have a padding token
The error occurs during the second finetuning stage, when running:

CUDA_VISIBLE_DEVICES=2 python3 train.py --cfg-path train_configs/minigpt4_stage2_finetune.yaml

Error:
Traceback (most recent call last):
  File "/home/ocr/projects/llm/MiniGPT-4/MiniGPT-4/train.py", line 104, in <module>
    main()
  File "/home/ocr/projects/llm/MiniGPT-4/MiniGPT-4/train.py", line 100, in main
    runner.train()
  File "/home/ocr/projects/llm/MiniGPT-4/MiniGPT-4/minigpt4/runners/runner_base.py", line 378, in train
    train_stats = self.train_epoch(cur_epoch)
  File "/home/ocr/projects/llm/MiniGPT-4/MiniGPT-4/minigpt4/runners/runner_base.py", line 438, in train_epoch
    return self.task.train_epoch(
  File "/home/ocr/projects/llm/MiniGPT-4/MiniGPT-4/minigpt4/tasks/base_task.py", line 114, in train_epoch
    return self._train_inner_loop(
  File "/home/ocr/projects/llm/MiniGPT-4/MiniGPT-4/minigpt4/tasks/base_task.py", line 219, in _train_inner_loop
    loss = self.train_step(model=model, samples=samples)
  File "/home/ocr/projects/llm/MiniGPT-4/MiniGPT-4/minigpt4/tasks/base_task.py", line 68, in train_step
    loss = model(samples)["loss"]
  File "/home/ocr/anaconda3/envs/minigpt4/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ocr/projects/llm/MiniGPT-4/MiniGPT-4/minigpt4/models/mini_gpt4.py", line 181, in forward
    to_regress_tokens = self.llama_tokenizer(
  File "/home/ocr/anaconda3/envs/minigpt4/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2538, in __call__
    encodings = self._call_one(text=text, text_pair=text_pair, **all_kwargs)
  File "/home/ocr/anaconda3/envs/minigpt4/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2624, in _call_one
    return self.batch_encode_plus(
  File "/home/ocr/anaconda3/envs/minigpt4/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2806, in batch_encode_plus
    padding_strategy, truncation_strategy, max_length, kwargs = self._get_padding_truncation_strategies(
  File "/home/ocr/anaconda3/envs/minigpt4/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2443, in _get_padding_truncation_strategies
    raise ValueError(
ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as `pad_token` `(tokenizer.pad_token = tokenizer.eos_token e.g.)` or add a new pad token via `tokenizer.add_special_tokens({'pad_token': '[PAD]'})`.
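A likely workaround, based on the hint in the error message (this is a sketch, not a verified MiniGPT-4 patch): the LLaMA tokenizer ships without a `pad_token`, so padding fails at the `self.llama_tokenizer(...)` call in `minigpt4/models/mini_gpt4.py`. Assigning the EOS token as the pad token right after the tokenizer is loaded should unblock batched encoding. The guard pattern, shown on a minimal stand-in object since loading real LLaMA weights is not possible here:

```python
# In MiniGPT-4 this would go after the tokenizer is created, e.g.
# (assumed location, adjust to your checkout):
#
#     self.llama_tokenizer.pad_token = self.llama_tokenizer.eos_token
#
# The same guard on a minimal stand-in tokenizer:
class StubTokenizer:
    pad_token = None      # LLaMA tokenizers have no pad token by default
    eos_token = "</s>"    # LLaMA's end-of-sequence token

tok = StubTokenizer()
if tok.pad_token is None:
    # Reuse EOS for padding, as the ValueError itself suggests.
    tok.pad_token = tok.eos_token

print(tok.pad_token)  # → </s>
```

The alternative the error message mentions, `tokenizer.add_special_tokens({'pad_token': '[PAD]'})`, grows the vocabulary by one token, so the model's input embeddings would also need resizing; reusing EOS avoids that.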
Environment:
NVIDIA-SMI 515.105.01, Driver Version: 515.105.01, CUDA Version: 11.7
GPU: NVIDIA A100 80GB