FinGPT
It does not work when I try to convert the default model "chatglm2" to "llama2"
Thanks for your awesome project. I reproduced FinGPT v3.1.2 (4-bit QLoRA). It works with the default LLM "chatglm2" on Colab, but it comes to a halt when I try to get better results with Llama2.
-
I changed the model as per your instructions, modifying model_name = "THUDM/chatglm2-6b" to model_name = "daryl149/llama-2-7b-chat-hf"
-
Then I removed the device argument due to a runtime error:
model = AutoModel.from_pretrained(
    model_name,
    quantization_config=q_config,
    trust_remote_code=True,
    token=access_token,
    # device='cuda'
)
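For reference, since the notebook describes a 4-bit QLoRA setup, the q_config passed above is presumably built with transformers' BitsAndBytesConfig, roughly along these lines (the exact values used in the FinGPT notebook may differ — treat this as an assumed sketch, not the official config):

```python
import torch
from transformers import BitsAndBytesConfig

# Assumed 4-bit QLoRA quantization config; check the notebook for exact values.
q_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize weights to 4 bits
    bnb_4bit_quant_type="nf4",             # normalized-float-4 quantization
    bnb_4bit_use_double_quant=True,        # quantize the quantization scales too
    bnb_4bit_compute_dtype=torch.float16,  # dtype used for matmuls at runtime
)
```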
-
Changed target_modules to the llama mapping:
target_modules = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING['llama']
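For anyone who wants to check what that mapping resolves to: in the peft versions this notebook targets, the llama entry points at the attention query/value projections. The snippet below hard-codes that resolution as an assumption — verify it against your installed peft version:

```python
# Assumed contents of peft's TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING
# for the two models discussed here (may differ across peft versions):
TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING = {
    "chatglm": ["query_key_value"],  # chatglm2 uses a single fused qkv projection
    "llama": ["q_proj", "v_proj"],   # llama has separate q/k/v/o projections
}

target_modules = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING["llama"]
print(target_modules)  # ['q_proj', 'v_proj']
```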
-
Unfortunately, the final step raised TypeError: 'NoneType' object cannot be interpreted as an integer:
writer = SummaryWriter()
trainer = ModifiedTrainer(
    model=model,
    args=training_args,              # Trainer args
    train_dataset=dataset["train"],  # Training set
    eval_dataset=dataset["test"],    # Testing set
    data_collator=data_collator,     # Data collator
    callbacks=[TensorBoardCallback(writer)],
)
trainer.train()
writer.close()
# save model
model.save_pretrained(training_args.output_dir)
The detailed error is as follows:
You are adding a <class 'transformers.integrations.TensorBoardCallback'> to the callbacks of this Trainer, but there is already one. The current list of callbacks is:
DefaultFlowCallback
TensorBoardCallback
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-27-d05cf508c134> in <cell line: 11>()
9 callbacks=[TensorBoardCallback(writer)],
10 )
---> 11 trainer.train()
12 writer.close()
13 # save model
6 frames
<ipython-input-25-26476d7038e4> in data_collator(features)
37 ids = ids + [tokenizer.pad_token_id] * (longest - ids_l)
38 _ids = torch.LongTensor(ids)
---> 39 labels_list.append(torch.LongTensor(labels))
40 input_ids.append(_ids)
41 input_ids = torch.stack(input_ids)
TypeError: 'NoneType' object cannot be interpreted as an integer
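For anyone debugging this: the failure can be reproduced without torch. The collator pads each sequence with tokenizer.pad_token_id, which LlamaTokenizer leaves as None, and torch.LongTensor then has to interpret each element as an integer. A minimal sketch of the same conversion (operator.index is Python's integer-conversion protocol, which produces the same error message):

```python
import operator

pad_token_id = None   # LlamaTokenizer ships without a pad token, so this is None
labels = [306, 626]   # hypothetical token ids for illustration
longest = 4
labels = labels + [pad_token_id] * (longest - len(labels))  # [306, 626, None, None]

try:
    # torch.LongTensor(labels) performs an equivalent per-element integer conversion
    [operator.index(t) for t in labels]
except TypeError as e:
    print(e)  # 'NoneType' object cannot be interpreted as an integer
```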
Could you please help me resolve this issue? Looking forward to your reply! (Platform: A100 on Google Colab)
I am also facing the same issue, running on an AWS g5.8xlarge. Were you able to solve it?
Since LlamaTokenizer does not set tokenizer.pad_token_id, its value is None.
When this tokenizer is used to build the 'labels' list, it raises:
TypeError: 'NoneType' object cannot be interpreted as an integer
I assigned the value 0 to tokenizer.pad_token_id and that error was resolved.
But then I got another error: Llama.forward() got an unexpected keyword argument 'labels'
To resolve this I replaced
model = AutoModel.from_pretrained(model_name, quantization_config=q_config, trust_remote_code=True)
with
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=q_config, trust_remote_code=True, device_map="auto")
and it worked fine.
TL;DR: add tokenizer.pad_token_id = 0 to your code.
The main problem is that the code (written for chatglm2) relies on pad_token_id, which the Meta Llama model does not set. But if we look a little closer into special_tokens_map.json, we can see the following lines:
"pad_token": "<unk>",
"unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
}
So as the pad_token we can use the id of the unk_token, which is equal to 0. To solve the problem with None, initialise the pad_token_id field of the tokenizer with the value 0: tokenizer.pad_token_id = 0
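To make the relationship concrete, here is the special_tokens_map.json excerpt above parsed as JSON: the pad token and the unk token are literally the same "<unk>" string, which is why padding with the unk token's id is safe (that this id is 0 is Llama-specific, as noted above):

```python
import json

# The special_tokens_map.json excerpt quoted above, wrapped in braces.
special_tokens_map = json.loads("""
{
    "pad_token": "<unk>",
    "unk_token": {
        "content": "<unk>",
        "lstrip": false,
        "normalized": true,
        "rstrip": false,
        "single_word": false
    }
}
""")

# pad_token aliases the unk token, so tokenizer.pad_token_id = 0 (unk's id) is consistent
assert special_tokens_map["pad_token"] == special_tokens_map["unk_token"]["content"]
print(special_tokens_map["pad_token"])  # <unk>
```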