AttributeError: 'NoneType' object has no attribute 'device'
Why is this happening?
# generate from the PEFT model
batch = tokenizer("Two things are infinite: ", return_tensors="pt")
with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=50)
print("\n\n", tokenizer.decode(output_tokens[0], skip_special_tokens=True))
It gives the following error:
AttributeError: 'NoneType' object has no attribute 'device'
Hi @imrankh46
Thanks for the issue! We are aware of it; for now the workaround is to pass device_map={"":0} when calling PeftModel.from_pretrained. We will work on a proper fix soon. The issue is due to a call to dispatch_model inside from_pretrained of PeftModel, which breaks a few things with Linear8bitLt layers.
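For reference, a minimal sketch of that workaround (base_model here is a placeholder for an already loaded 8-bit base model, and the adapter path is only illustrative):

from peft import PeftModel

# pin the whole PEFT model to GPU 0 so the internal dispatch_model call
# does not break the Linear8bitLt layers
model = PeftModel.from_pretrained(
    base_model,               # placeholder: your 8-bit base model
    "path/to/lora-adapter",   # placeholder: your adapter id or path
    device_map={"": 0},
)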
Thanks for the response.
@imrankh46 I believe https://github.com/huggingface/accelerate/pull/1237 should have fixed your issue. Can you try installing accelerate from source and let us know if you still face the issue?
pip install git+https://github.com/huggingface/accelerate
Hi @younesbelkada, I have tried installing accelerate from source, but I got another error:
NotImplementedError: Cannot copy out of meta tensor; no data!
Do you know the possible reason for this? Thanks!
I have encountered the same problem; my version is peft==0.2.0. Have you resolved this issue?
@YSLLYW
Can you try installing accelerate from source?
pip install git+https://github.com/huggingface/accelerate
You need to pass the device_map parameter like this: device_map={"":0}. The complete code is below; you just need to pass your model name.
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, LlamaTokenizer

# pass your adapter model name
peft_model_id = 'your_model_name'
config = PeftConfig.from_pretrained(peft_model_id)

# load the 8-bit base model
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    load_in_8bit=True,
    device_map='auto',
)
tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path)

# load the LoRA model, pinning it to GPU 0
model = PeftModel.from_pretrained(model, peft_model_id, torch_dtype=torch.float16, device_map={"":0})
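After loading, generation works as in the snippet from the original post; moving the inputs to "cuda:0" is an assumption to match device_map={"":0}:

batch = tokenizer("Two things are infinite: ", return_tensors="pt").to("cuda:0")  # assumes GPU 0
with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=50)
print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))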
I already solved the issue. Thanks.
Yes, I just updated the PEFT version to 0.3.0 and that resolved the issue. Thank you for your reply.
Please also add Contrastive Search for text generation. When I pass penalty_alpha=0.6, I get a CUDA out-of-memory error.
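For reference, contrastive search in transformers is enabled by combining penalty_alpha with a small top_k; a minimal sketch (the prompt and parameter values are only illustrative):

batch = tokenizer("Two things are infinite: ", return_tensors="pt").to("cuda:0")
output_tokens = model.generate(
    **batch,
    penalty_alpha=0.6,   # degeneration penalty for contrastive search
    top_k=4,             # candidate pool size; larger values need more memory
    max_new_tokens=50,
)
print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))

Since contrastive search keeps the hidden states of all top_k candidates at each step, reducing top_k or max_new_tokens may help with the out-of-memory error.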
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.