
OverflowError: out of range integral type conversion attempted while running python qlora.py

Open amdnsr opened this issue 1 year ago • 12 comments

python qlora.py --model_name_or_path decapoda-research/llama-13b-hf

(I have updated tokenizer_config.json and config.json as per the various discussions here: tokenizer_class is set to LlamaTokenizer and architectures to LlamaForCausalLM.)

==================================================================================

adding LoRA modules...
trainable params: 125173760.0 || all params: 6922327040 || trainable: 1.8082612866554193
loaded model

Using pad_token, but it is not set yet.
Traceback (most recent call last):
  File "qlora.py", line 758, in <module>
    train()
  File "qlora.py", line 620, in train
    "unk_token": tokenizer.convert_ids_to_tokens(model.config.pad_token_id),
  File "/home/envs/qlora_env/lib/python3.8/site-packages/transformers/tokenization_utils_fast.py", line 307, in convert_ids_to_tokens
    return self._tokenizer.id_to_token(ids)
OverflowError: out of range integral type conversion attempted

amdnsr avatar May 25 '23 08:05 amdnsr

Running on Tesla V100 32GB GPU

amdnsr avatar May 25 '23 08:05 amdnsr

same issue

update

Changing model.config.pad_token_id to 0 should fix this problem, but it may harm training.

        tokenizer.add_special_tokens(
            {
                "eos_token": tokenizer.convert_ids_to_tokens(model.config.eos_token_id),
                "bos_token": tokenizer.convert_ids_to_tokens(model.config.bos_token_id),
                "unk_token": tokenizer.convert_ids_to_tokens(0),
            }
        )
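
One way to see why substituting 0 sidesteps the error: the traceback ends in `self._tokenizer.id_to_token(ids)`, and an "out of range integral type conversion" there suggests the id being passed is not a valid non-negative index. The sketch below assumes (this is not confirmed anywhere in the thread) that the checkpoint's config.json carries a pad_token_id of -1 and a 32000-entry vocabulary. `is_valid_token_id` and `pick_token_id` are hypothetical helper names, not part of qlora.py or transformers.

```python
from typing import Optional


def is_valid_token_id(token_id: Optional[int], vocab_size: int) -> bool:
    """Return True if token_id can index a vocabulary of vocab_size entries."""
    return token_id is not None and 0 <= token_id < vocab_size


def pick_token_id(token_id: Optional[int], vocab_size: int, fallback: int = 0) -> int:
    """Use token_id when it is in range, otherwise the fallback (0 mirrors the workaround above)."""
    return token_id if is_valid_token_id(token_id, vocab_size) else fallback


# Assumed values for illustration only; check your own config.json.
ASSUMED_PAD_TOKEN_ID = -1
ASSUMED_VOCAB_SIZE = 32000

safe_pad_id = pick_token_id(ASSUMED_PAD_TOKEN_ID, ASSUMED_VOCAB_SIZE)
print(safe_pad_id)
```

Whether id 0 is a sensible stand-in depends on what token sits at that position in the tokenizer you are using, which is probably why the workaround is flagged as possibly harmful to training.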

MaticsL avatar May 25 '23 09:05 MaticsL

same issue

LIO-H-ZEN avatar May 25 '23 09:05 LIO-H-ZEN

Please check if this pr fixes your issue. The pr was designed to fix something else but will also bypass token conversion if tokenizer already contains the special tokens.

https://github.com/artidoro/qlora/pull/20

Qubitium avatar May 25 '23 09:05 Qubitium

same issue

ricksun2023 avatar May 25 '23 16:05 ricksun2023

Please check if this pr fixes your issue. The pr was designed to fix something else but will also bypass token conversion if tokenizer already contains the special tokens.

#20

This solved my issue, but I am now getting a maximum recursion depth error.

  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1142, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 250, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1142, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 250, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1142, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 250, in convert_tokens_to_ids
    return self._convert_token_to_id_with_added_voc(tokens)
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 257, in _convert_token_to_id_with_added_voc
    return self.unk_token_id
  File "/home/atilla/miniconda3/envs/qlora/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1142, in unk_token_id
    return self.convert_tokens_to_ids(self.unk_token)
RecursionError: maximum recursion depth exceeded

atillabasaran avatar May 25 '23 17:05 atillabasaran


same...
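
Reading the recursion traceback above, the loop appears to be: a token that is not found falls back to `unk_token_id`, which looks up `unk_token` itself, which is also not found, and so on. A quick check of whether each special token string is actually present in the vocabulary may help narrow this down. This is only a sketch: `find_missing_special_tokens` is a hypothetical helper, and the toy vocabulary stands in for what `tokenizer.get_vocab()` would return on a real tokenizer.

```python
from typing import Dict, List, Optional


def find_missing_special_tokens(
    vocab: Dict[str, int], special_tokens: Dict[str, Optional[str]]
) -> List[str]:
    """Return the names of special tokens that are unset or absent from the vocabulary."""
    return [
        name
        for name, token in special_tokens.items()
        if token is None or token not in vocab
    ]


# Toy stand-ins. With a real tokenizer you might pass tokenizer.get_vocab() and
# {"unk_token": tokenizer.unk_token, "bos_token": tokenizer.bos_token, ...}.
toy_vocab = {"<s>": 1, "</s>": 2, "hello": 3}
toy_special = {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>"}

missing = find_missing_special_tokens(toy_vocab, toy_special)
print(missing)
```

If `unk_token` shows up in the result for your tokenizer, that would be consistent with the recursion in the traceback.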

LIO-H-ZEN avatar May 26 '23 02:05 LIO-H-ZEN

Hi, I switched to huggyllama/llama-7b and applied the change from #20. That avoided the errors above, but now I get the error below:

Traceback (most recent call last):
  File "/Workspace/Repos/[email protected]/qlora/qlora.py", line 853, in <module>
    train()
  File "/Workspace/Repos/[email protected]/qlora/qlora.py", line 824, in train
    metrics = trainer.evaluate(metric_key_prefix="eval")
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-7966afd1-6600-45a5-a135-50b716d0345e/lib/python3.10/site-packages/transformers/trainer_seq2seq.py", line 159, in evaluate
    return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-7966afd1-6600-45a5-a135-50b716d0345e/lib/python3.10/site-packages/transformers/trainer.py", line 3108, in evaluate
    self.control = self.callback_handler.on_evaluate(self.args, self.state, self.control, output.metrics)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-7966afd1-6600-45a5-a135-50b716d0345e/lib/python3.10/site-packages/transformers/trainer_callback.py", line 379, in on_evaluate
    return self.call_event("on_evaluate", args, state, control, metrics=metrics)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-7966afd1-6600-45a5-a135-50b716d0345e/lib/python3.10/site-packages/transformers/trainer_callback.py", line 397, in call_event
    result = getattr(callback, event)(
  File "/Workspace/Repos/[email protected]/qlora/qlora.py", line 751, in on_evaluate
    refs += [abcd_idx.index(label) for label in labels.tolist()]
  File "/Workspace/Repos/[email protected]/qlora/qlora.py", line 751, in <listcomp>
    refs += [abcd_idx.index(label) for label in labels.tolist()]
ValueError: 29879 is not in list

I found that:

abcd_idx:  [319, 350, 315, 360]

labels tensor([  319, 29879,   350, 29879,   319, 29879,   315, 29879,   315, 29879,
          360, 29879,   350, 29879,   360, 29879], device='cuda:0')

Does anyone have an idea how to sort this out?
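
The ValueError comes from `abcd_idx.index(label)` being called on a label id (29879) that is not one of the four answer ids. One possible workaround, offered as a sketch rather than a confirmed fix, is to keep only the labels that are answer ids before mapping them to positions. It does not explain why 29879 follows every answer token, and the predictions would need the same filtering to stay aligned with the references. The values are copied from the output above; `answer_indices` is a hypothetical helper.

```python
from typing import List

abcd_idx = [319, 350, 315, 360]
labels = [319, 29879, 350, 29879, 319, 29879, 315, 29879,
          315, 29879, 360, 29879, 350, 29879, 360, 29879]


def answer_indices(label_ids: List[int], answer_ids: List[int]) -> List[int]:
    """Map each label that is an answer id to its position in answer_ids, skipping the rest."""
    position = {token_id: i for i, token_id in enumerate(answer_ids)}
    return [position[t] for t in label_ids if t in position]


refs = answer_indices(labels, abcd_idx)
print(refs)
```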

ghtaro avatar May 26 '23 15:05 ghtaro

same issue

update

Changing model.config.pad_token_id to 0 should fix this problem, but it may harm training.

        tokenizer.add_special_tokens(
            {
                "eos_token": tokenizer.convert_ids_to_tokens(model.config.eos_token_id),
                "bos_token": tokenizer.convert_ids_to_tokens(model.config.bos_token_id),
                "unk_token": tokenizer.convert_ids_to_tokens(0),
            }
        )

I wonder what exactly this change does.

atillabasaran avatar May 26 '23 17:05 atillabasaran

same issue

mofanv avatar May 27 '23 01:05 mofanv

@amdnsr,

Based on the error message you shared, it appears that there is an "OverflowError: out of range integral type conversion attempted" when converting token IDs during tokenization. To address this issue, we recommend the following solution:

  1. Update your code as follows:
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-13b-hf")
tokenizer.add_special_tokens({"pad_token": "[PAD]"})

# Rest of your code...

By using the LlamaTokenizer from the transformers library and adding the [PAD] token as a special token, you can resolve the "out of range integral type conversion" error.
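
Note that adding a new [PAD] token grows the vocabulary by one, so the model's embedding matrix generally needs to be resized to match; otherwise the new id can point past the end of the embeddings. A minimal sketch, assuming `tokenizer` and `model` are already-loaded transformers objects. The function only relies on `pad_token`, `add_special_tokens`, `len(tokenizer)`, and `resize_token_embeddings`; `add_pad_token` itself is a hypothetical helper name.

```python
def add_pad_token(tokenizer, model, pad_token: str = "[PAD]") -> int:
    """Add pad_token if the tokenizer has none, then resize the model's embeddings.

    Returns the number of tokens that were added (0 if a pad token already existed).
    """
    if tokenizer.pad_token is not None:
        return 0
    num_added = tokenizer.add_special_tokens({"pad_token": pad_token})
    if num_added > 0:
        # Keep the embedding matrix in step with the enlarged vocabulary.
        model.resize_token_embeddings(len(tokenizer))
        # Record the new id on the model config so both sides agree.
        model.config.pad_token_id = tokenizer.pad_token_id
    return num_added
```

With real objects the call would be `add_pad_token(tokenizer, model)` right after loading both.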

Best regards, @hemangjoshi37a

hemangjoshi37a avatar May 28 '23 08:05 hemangjoshi37a

Setting pad_token_id in the model config worked for me.

For example, for Vicuna: model.config.pad_token_id = tokenizer.eos_token_id
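
Spelled out a little more, that might look like the sketch below. It assumes the tokenizer has an EOS token and that reusing it for padding is acceptable for your training setup (padded positions are usually masked out of the loss, but that depends on the data collator). `use_eos_as_pad` is a hypothetical helper, and the `SimpleNamespace` objects with made-up ids are stand-ins for a real model config and tokenizer.

```python
from types import SimpleNamespace


def use_eos_as_pad(config, tokenizer) -> int:
    """Point pad_token_id at the EOS id when the configured pad id is unset or negative."""
    pad_id = getattr(config, "pad_token_id", None)
    if pad_id is None or pad_id < 0:
        if tokenizer.eos_token_id is None:
            raise ValueError("tokenizer has no eos_token_id to reuse for padding")
        config.pad_token_id = tokenizer.eos_token_id
    return config.pad_token_id


# Stand-in objects with assumed ids; with real objects: use_eos_as_pad(model.config, tokenizer)
demo_config = SimpleNamespace(pad_token_id=-1)
demo_tokenizer = SimpleNamespace(eos_token_id=2)

print(use_eos_as_pad(demo_config, demo_tokenizer))
```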

saxenarohit avatar Oct 06 '23 23:10 saxenarohit