
It did not work when I tried to convert the default model "chatglm2" to "llama2"

Open maywind23 opened this issue 1 year ago • 3 comments

Thanks for your awesome project. I reproduced FinGPT v3.1.2 (4-bit QLoRA). It does work with the default LLM "chatglm2" on Colab, but it comes to a halt when I try to get better results with Llama2.

  • I have changed the model as per your instructions, modifying model_name = "THUDM/chatglm2-6b" to model_name = "daryl149/llama-2-7b-chat-hf"

  • Then removed the device argument due to a runtime error:

model = AutoModel.from_pretrained(
    model_name,
    quantization_config=q_config,  # 4-bit QLoRA config
    trust_remote_code=True,
    token=access_token,
    # device='cuda'  # removed: raised a runtime error with quantized loading
)
  • Changed the target_modules to llama's defaults: target_modules = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING['llama'] (see the sketch after the traceback below)

  • Unfortunately, the final step raised a TypeError: 'NoneType' object cannot be interpreted as an integer

writer = SummaryWriter()
trainer = ModifiedTrainer(
    model=model,
    args=training_args,             # Trainer args
    train_dataset=dataset["train"], # Training set
    eval_dataset=dataset["test"],   # Testing set
    data_collator=data_collator,    # Data Collator
    callbacks=[TensorBoardCallback(writer)],
)
trainer.train()
writer.close()
# save model
model.save_pretrained(training_args.output_dir)

The detailed error is as follows:

You are adding a <class 'transformers.integrations.TensorBoardCallback'> to the callbacks of this Trainer, but there is already one. The current list of callbacks is
:DefaultFlowCallback
TensorBoardCallback
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-27-d05cf508c134> in <cell line: 11>()
      9     callbacks=[TensorBoardCallback(writer)],
     10 )
---> 11 trainer.train()
     12 writer.close()
     13 # save model

6 frames
<ipython-input-25-26476d7038e4> in data_collator(features)
     37         ids = ids + [tokenizer.pad_token_id] * (longest - ids_l)
     38         _ids = torch.LongTensor(ids)
---> 39         labels_list.append(torch.LongTensor(labels))
     40         input_ids.append(_ids)
     41     input_ids = torch.stack(input_ids)

TypeError: 'NoneType' object cannot be interpreted as an integer
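For reference, the target_modules lookup from the third bullet resolves to peft's built-in defaults. A minimal sketch of the lookup, assuming a peft version that exports this mapping from peft.utils (the exact values vary by release):

# Inspect peft's default LoRA target modules for llama
from peft.utils import TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING

target_modules = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING['llama']
print(target_modules)  # typically ['q_proj', 'v_proj']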

Could you please help resolve this issue? Looking forward to your reply! (Platform: A100 on Google Colab)

maywind23 avatar Aug 29 '23 03:08 maywind23

I am also facing the same issue. I am running on AWS g5.8xlarge. Were you able to solve it?

rajendrac3 avatar Sep 15 '23 07:09 rajendrac3

Since LlamaTokenizer does not set a value for tokenizer.pad_token_id, it is None. When this tokenizer is used to build the 'labels' in the data collator, it raises TypeError: 'NoneType' object cannot be interpreted as an integer.
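A minimal sketch of the failure mode, using the checkpoint name from this thread (assuming its tokenizer files define no pad token):

import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("daryl149/llama-2-7b-chat-hf")
print(tokenizer.pad_token_id)  # None when no pad token is defined

# The notebook's data_collator pads with pad_token_id, so the padded
# list contains None, which torch.LongTensor() rejects:
ids = [1, 2, 3] + [tokenizer.pad_token_id] * 2  # [1, 2, 3, None, None]
torch.LongTensor(ids)  # TypeError: 'NoneType' object cannot be interpreted as an integer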

I assigned the value 0 to tokenizer.pad_token_id and the above error was resolved. But then I got another error: Llama.forward() got an unexpected keyword argument 'labels'.

To resolve this, I replaced

model = AutoModel.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True
)

with

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    device_map="auto"
)

and it worked fine.
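The class swap matters because AutoModel resolves to the base LlamaModel, whose forward() takes no labels argument, while AutoModelForCausalLM resolves to LlamaForCausalLM, which accepts labels and computes the language-modeling loss the Trainer needs. A quick way to verify this (a sketch, assuming a transformers version that includes the llama architecture):

import inspect
from transformers.models.llama.modeling_llama import LlamaForCausalLM, LlamaModel

print('labels' in inspect.signature(LlamaModel.forward).parameters)        # False
print('labels' in inspect.signature(LlamaForCausalLM.forward).parameters)  # True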

rajendrac3 avatar Sep 18 '23 12:09 rajendrac3

TL;DR: add tokenizer.pad_token_id = 0 in your code

The main problem is that the code written for chatglm2 relies on pad_token_id, which the Meta LLaMA model does not use. But if we look a little closer into special_tokens_map.json, we can see the following lines:

  "pad_token": "<unk>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }

So as the pad_token we can use the id of the unk_token, which is equal to 0. To solve the problem with None, initialise the pad_token_id field of the tokenizer with the value 0: tokenizer.pad_token_id = 0
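A minimal sketch of the fix, using the checkpoint name from this thread:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("daryl149/llama-2-7b-chat-hf")
print(tokenizer.convert_tokens_to_ids("<unk>"))  # 0 in the LLaMA vocab

tokenizer.pad_token_id = 0  # pad with <unk>; the data collator can now build labels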

IshchenkoRoman avatar Feb 23 '24 12:02 IshchenkoRoman