
OverflowError: out of range integral type conversion attempted

Open · 1-sf opened this issue 1 year ago · 4 comments

I get the following error:

You're using a T5TokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
100% 1061/1061 [2:04:03<00:00, 3.45s/it]
Traceback (most recent call last):
  File "/content/mm-cot/main.py", line 395, in <module>
    T5Trainer(
  File "/content/mm-cot/main.py", line 284, in T5Trainer
    metrics = trainer.evaluate(eval_dataset = test_set, max_length=args.output_len)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer_seq2seq.py", line 159, in evaluate
    return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3043, in evaluate
    output = eval_loop(
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3343, in evaluation_loop
    metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
  File "/content/mm-cot/main.py", line 215, in compute_metrics_rougel
    preds = tokenizer.batch_decode(preds, skip_special_tokens=True, clean_up_tokenization_spaces=True)
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 3469, in batch_decode
    return [
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 3470, in <listcomp>
    self.decode(
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 3509, in decode
    return self._decode(
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_fast.py", line 546, in _decode
    text = self._tokenizer.decode(token_ids, skip_special_tokens=skip_special_tokens)
OverflowError: out of range integral type conversion attempted

This occurs when I run inference for rationale generation:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py \
  --data_root data/ScienceQA/data \
  --caption_file data/instruct_captions.json \
  --model declare-lab/flan-alpaca-large \
  --user_msg rationale --img_type vit \
  --bs 2 --eval_bs 4  --epoch 50 --lr 5e-5 --output_len 512 \
  --use_caption --use_generate --prompt_format QCM-E \
  --output_dir experiments \
  --evaluate_dir models/mm-cot-large-rationale

The error appears only after all 1061 evaluation iterations have completed. As a consequence, the run doesn't generate experiments/rationale_declare-lab-flan-alpaca-large_vit_QCM-E_lr5e-05_bs8_op512_ep50/predictions_ans_eval.json, which the answer-inference stage expects as input.
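
If it helps to isolate the failure, the OverflowError itself can be reproduced with just the tokenizer. A minimal sketch (google/flan-t5-small is only a small stand-in checkpoint here, not the model from the command above):

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")

    # -100 is the ignore index used to pad labels; the fast tokenizer's
    # Rust backend converts ids to unsigned integers, so any negative id
    # triggers the same error as in the traceback above.
    tokenizer.decode([71, 182, -100, 1], skip_special_tokens=True)
    # OverflowError: out of range integral type conversion attempted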

1-sf · Dec 31 '23

I have the same problem.

I suspect the error is caused by the tokenizer failing to decode preds, since preds contains -100 values. It seems related to this issue: https://github.com/huggingface/transformers/issues/22634
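
If that's the cause, the usual workaround is to map the -100 entries back to pad_token_id before decoding. A sketch of what that could look like in compute_metrics_rougel (assuming preds and labels arrive as NumPy arrays via EvalPrediction and tokenizer is in scope, as in main.py; the actual metric computation is elided):

    import numpy as np

    def compute_metrics_rougel(eval_preds):
        preds, labels = eval_preds.predictions, eval_preds.label_ids
        # Replace the -100 ignore index with pad_token_id so the fast
        # tokenizer's Rust backend only sees non-negative token ids.
        preds = np.where(preds != -100, preds, tokenizer.pad_token_id)
        labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
        preds = tokenizer.batch_decode(preds, skip_special_tokens=True,
                                       clean_up_tokenization_spaces=True)
        targets = tokenizer.batch_decode(labels, skip_special_tokens=True,
                                         clean_up_tokenization_spaces=True)
        # ... compute ROUGE-L between preds and targets as before ...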

SanghyeokSon · Jan 14 '24

I tried the fix from https://github.com/huggingface/transformers/issues/24433#issuecomment-1764248213 and it seems to have worked.

1-sf · Jan 14 '24

This issue is likely caused by an update of the transformers library. The solution above seems to be effective.

cooelf · May 19 '24