intel-extension-for-pytorch
return_dict_in_generate not working for model.generate after ipex.llm.optimize
Describe the bug
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
model_name = "meta-llama/Meta-Llama-3.1-8B"
dtype = "bfloat16"
amp_enabled = dtype != "float32"
amp_dtype = getattr(torch, dtype)
config = AutoConfig.from_pretrained(model_name, torchscript=True)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=amp_dtype, config=config, low_cpu_mem_usage=True, trust_remote_code=True)
model = model.to(memory_format=torch.channels_last)
model = model.eval()
model_org = model  # keep a reference to the unoptimized model for comparison
model = ipex.llm.optimize(model, dtype=amp_dtype, inplace=False, deployment_mode=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# input_ids was not defined in the original report; an example prompt is assumed here
input_ids = tokenizer("Hello, my name is", return_tensors="pt").input_ids
model.generate(input_ids, use_cache=True, return_dict_in_generate=True)  # optimized model
model_org.generate(input_ids, use_cache=True, return_dict_in_generate=True)  # original model
model.generate returns only a tensor, while model_org.generate returns a dict-like output object, even though both calls pass return_dict_in_generate=True.
Is there an explanation or a solution for this? Thanks.
Versions
2.4.0+cpu
@YYue000 Thanks for reporting this issue. We will look into it and give feedback later.
@YYue000 I can reproduce this issue. Using 2.4.0+cpu:
a) Without ipex.llm.optimize(), invoking model.generate() with return_dict_in_generate=True returns an object of type transformers.generation.utils.GenerateDecoderOnlyOutput, from which attributes such as sequences can be retrieved.
b) With ipex.llm.optimize(), model.generate() always returns a tensor object, regardless of return_dict_in_generate.
This issue has been fixed by our dev team. The fix will be included in the upcoming ipex 2.5 release.
@YYue000 IPEX v2.5.0+cpu was released yesterday. The return_dict_in_generate issue is fixed by this commit: https://github.com/intel/intel-extension-for-pytorch/commit/584a4e2e2c6193b926554f951d2608489cac5d7a. Please help verify on your side.
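When verifying, it may help to first confirm that the installed build is at least 2.5. The sketch below is an assumption-laden convenience, not an official check: has_return_dict_fix is a hypothetical name, and the 2.5 threshold is taken only from the release mentioned in this thread. It parses version strings of the form shown here, such as 2.4.0+cpu.

```python
def has_return_dict_fix(version: str) -> bool:
    """Return True if `version` is 2.5 or newer.

    Hypothetical helper. Assumes a version string like "2.4.0+cpu":
    the local suffix after "+" is dropped and only the major and
    minor numbers are compared against (2, 5).
    """
    public = version.split("+", 1)[0]
    major, minor = (int(part) for part in public.split(".")[:2])
    return (major, minor) >= (2, 5)
```

It could be called with the installed version string, for example has_return_dict_fix(ipex.__version__), before rerunning the two generate() calls from the report and checking that both return a dict-style output.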