intel-extension-for-pytorch
Unsupported input type, fallback to the origin model
Describe the issue
I am trying to run the "meta-llama/llama-2-7b-chat-hf" model with the llm-on-ray framework; however, I am getting the following output:
(ServeReplica:router:PredictorDeployment pid=3040030) /home/develop/.anaconda3/envs/llm-on-ray/lib/python3.9/site-packages/intel_extension_for_pytorch/transformers/models/reference/modules/attentions.py:962: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
(ServeReplica:router:PredictorDeployment pid=3040030) + torch.tensor(combined_attention_mask)
(ServeReplica:router:PredictorDeployment pid=3040030) /home/develop/.anaconda3/envs/llm-on-ray/lib/python3.9/site-packages/intel_extension_for_pytorch/transformers/optimize.py:683: UserWarning: fail to apply optimize_transformers due to: Unsupported input type, fallback to the origin model
I think llama-2-7b-chat-hf should be supported for optimizations. These are the versions I am using:
- transformers 4.31
- torch 2.1.0
- ipex 2.1.0
Kindly assist in solving this issue. Thank you!
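For context, a minimal sketch of how optimize_transformers is typically applied to a Hugging Face causal LM is shown below; the dtype and inplace arguments are assumptions for illustration and this is not the exact llm-on-ray code path:

```python
# Minimal sketch (assumed usage; not the exact llm-on-ray code path).
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# If ipex does not recognize the model or its inputs as a supported LLM type,
# it emits "Unsupported input type, fallback to the origin model" and returns
# the unoptimized model, which matches the warning in the log above.
model = ipex.optimize_transformers(model, dtype=torch.bfloat16, inplace=True)
```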
Hi @akarX23, I will reproduce the issue and get back to you.
Sure, thank you for your time and support @Vasud-ha
Hi @akarX23, could you update transformers to >=4.35.0? Refer to https://github.com/intel/llm-on-ray/tree/main
Hi @Vasud-ha, the reason I am using transformers 4.31 is that bigdl-llm[all] requires transformers 4.31. This is the output of pip install ".[cpu,bigdl-cpu]":
134.0 The conflict is caused by:
134.0 llm-on-ray[bigdl-cpu,cpu] 0.0.1 depends on transformers>=4.35.0; extra == "cpu"
134.0 bigdl-llm[all] 2.5.0b20240222 depends on transformers==4.31.0; extra == "all"
I would have to use either ipex or bigdl, but not both at the same time.
Can you suggest a version of bigdl-llm compatible with transformers 4.35 or 4.37?
Could you try installing bigdl-llm with this command: pip install --pre --upgrade bigdl-llm[xpu_2.1] -f https://developer.intel.com/ipex-whl-stable-xpu, with transformers 4.35 or newer?
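Put together, that suggestion would look roughly like the following; the bigdl-llm extra and the transformers pin are taken from the comments above, so please verify them against the bigdl-llm documentation:

```
pip install --pre --upgrade "bigdl-llm[xpu_2.1]" -f https://developer.intel.com/ipex-whl-stable-xpu
pip install "transformers>=4.35.0"
```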
@akarX23 You need to update torch/ipex to 2.2 and transformers to 4.35.2; check this: https://github.com/intel/llm-on-ray/pull/143
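For reference, a hedged sketch of that version bump; the exact pins should be verified against the linked PR:

```
# assumed pins based on the suggestion above; check intel/llm-on-ray PR #143 for the exact versions
pip install torch==2.2.0
pip install intel-extension-for-pytorch==2.2.0
pip install transformers==4.35.2
```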
@Vasud-ha @xwu99, I will try these suggestions soon. For now, I have shifted to OVMS with ITREX, and the performance is pretty good. I will update you with what I find.
Hi @akarX23 are you still working on llm-on-ray, and does the issue persist? Thanks.
Hi @ZailiWang, the work with llm-on-ray has currently stopped, and I have not checked whether the issue persists. We have customers who will be trying out llm-on-ray soon; if anything comes up, I will raise the issue again. Thank you for the help!