Ze Han

13 comments by Ze Han

Has the loss = 0 issue been solved? When I freeze the LLaMA model's gradients and fine-tune only the LoRA weights, training works fine. But once I set requires_grad=True on LLaMA's embedding layer and train the embeddings together with LoRA, only the first step produces a non-zero loss; after that the loss stays at 0.

Code:

```python
import torch
from transformers import LlamaForCausalLM, TrainingArguments, Trainer, DataCollatorForLanguageModeling

model = LlamaForCausalLM.from_pretrained(
    'decapoda-research/llama-7b-hf',
    device_map='auto',
    cache_dir='./cache/',
    load_in_8bit=True,
)
model.resize_token_embeddings(len(merge_tokenizer))

trainArgs = TrainingArguments(
    output_dir='../ckps',
    do_train=True,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    evaluation_strategy="steps",
    save_strategy="steps",
    save_steps=1000,
    eval_steps=100,
    logging_steps=1,
    warmup_steps=100,
    num_train_epochs=2,
    learning_rate=3e-4,
    ...
```
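Since the snippet above stops before the LoRA setup, here is a minimal sketch of how the embedding layer can be kept trainable alongside the LoRA adapters with peft. The LoraConfig values (r, lora_alpha, target_modules, modules_to_save) are my own illustrative assumptions, not taken from the original script:

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# Prepare the 8-bit base model for training (casts norms to fp32,
# enables input gradients for checkpointing, etc.).
model = prepare_model_for_int8_training(model)

lora_config = LoraConfig(
    r=8,                                   # illustrative values, not from the original script
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],
    # Keep the (resized) embedding and output head trainable and saved
    # together with the adapter weights.
    modules_to_save=["embed_tokens", "lm_head"],
)
model = get_peft_model(model, lora_config)

# Equivalent to "setting the embed layer's gradient to True" by hand:
for name, param in model.named_parameters():
    if "embed_tokens" in name:
        param.requires_grad = True

model.print_trainable_parameters()
```

Listing embed_tokens (and lm_head, since the token embeddings were resized) under modules_to_save also makes sure those weights are written out with the adapter checkpoint instead of being silently dropped.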

Thanks for the reply. Yes, this is under int8. When I turn int8 off and fine-tune with peft LoRA, I get an error saying the tensors are split between GPU and CPU. After printing the device of each model weight, I found that all the LoRA weights are on the CPU; the rest of the code is the same as what I posted above. How can I solve this?
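For anyone hitting the same device mismatch, this is roughly how the per-parameter devices can be printed; the `.to("cuda")` line at the end is only a guessed workaround (with device_map='auto' it may need adjusting), not a confirmed fix:

```python
from collections import Counter

# Count how many parameters live on each device.
print(Counter(str(p.device) for p in model.parameters()))

# List any LoRA parameters that are still on the CPU.
for name, param in model.named_parameters():
    if "lora_" in name and param.device.type == "cpu":
        print(name, param.device)

# Possible workaround (assumption): move the PEFT-wrapped model to the GPU
# after get_peft_model so the freshly initialised adapter weights follow it.
model = model.to("cuda")
```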

A new peft version has been released: https://github.com/huggingface/peft/releases/tag/v0.3.0

Hello, have you solved it? I have the same problem as you.

Has this been solved? I'm running into the same problem.

Hi, thanks for the reply. I found that RedPajama's base model is GPT-NeoX, but the tokenizer on the Hugging Face Hub is only the fast version, so I wanted to use GPT-Neo instead, whose tokenizer is based on GPT-2. When I tried GPT2Tokenizer with `red_tokenizer = GPT2Tokenizer.from_pretrained('EleutherAI/gpt-neo-1.3B', cache_dir='./cache/')`, I still get the same error: (error screenshot did not upload)
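If it helps, this is what I would try instead of forcing the GPT2Tokenizer class; letting AutoTokenizer pick the tokenizer class (and using the fast GPT-NeoX tokenizer, shown here via the EleutherAI/gpt-neox-20b checkpoint as an example) is an assumption on my part, not a verified fix:

```python
from transformers import AutoTokenizer

# Let AutoTokenizer resolve the correct tokenizer class for each checkpoint.
# GPT-Neo ships a GPT-2 style byte-level BPE tokenizer, so this should load
# without forcing the GPT2Tokenizer class.
neo_tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neo-1.3B', cache_dir='./cache/')

# GPT-NeoX (the RedPajama base architecture) only publishes a fast tokenizer,
# so keep use_fast=True (the default) when loading it.
neox_tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neox-20b', cache_dir='./cache/', use_fast=True)
```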