
Unsupported input type, fallback to the origin model

Open akarX23 opened this issue 1 year ago • 8 comments

Describe the issue

I am trying to run "meta-llama/llama-2-7b-chat-hf" with the llm-on-ray framework; however, I am getting the following output:

(ServeReplica:router:PredictorDeployment pid=3040030) /home/develop/.anaconda3/envs/llm-on-ray/lib/python3.9/site-packages/intel_extension_for_pytorch/transformers/models/reference/modules/attentions.py:962: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
(ServeReplica:router:PredictorDeployment pid=3040030)   + torch.tensor(combined_attention_mask)
(ServeReplica:router:PredictorDeployment pid=3040030) /home/develop/.anaconda3/envs/llm-on-ray/lib/python3.9/site-packages/intel_extension_for_pytorch/transformers/optimize.py:683: UserWarning: fail to apply optimize_transformers due to: Unsupported input type, fallback to the origin model
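
For reference, the first UserWarning is PyTorch's standard advice about copy-constructing tensors and is separate from the fallback itself; a minimal sketch of the recommended pattern (the tensor name here is made up):

```python
import torch

mask = torch.ones(1, 1, 8, 8)

# This copy-construction triggers the UserWarning shown in the log
copied = torch.tensor(mask)

# Recommended equivalent that avoids the warning
copied = mask.clone().detach()
```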

I think llama-2-7b-chat-hf should be supported for optimizations. These are the versions I am using:

  • transformers 4.31
  • torch 2.1.0
  • ipex 2.1.0

Kindly assist in solving this issue. Thank you!
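
For context, a minimal sketch of how the optimization named in the second warning is typically applied to a Hugging Face model. This assumes the ipex 2.1 optimize_transformers API with bfloat16; the exact call inside llm-on-ray may differ:

```python
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# optimize_transformers rewrites supported decoder architectures (Llama among them);
# when it cannot recognize the model or its inputs it emits the warning above and
# returns the original, unoptimized model instead of raising an error.
model = ipex.optimize_transformers(model, dtype=torch.bfloat16)
```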

akarX23 avatar Feb 21 '24 03:02 akarX23

Hi @akarX23, I will reproduce the issue and get back to you.

Vasud-ha avatar Feb 23 '24 05:02 Vasud-ha

Sure, thank you for your time and support @Vasud-ha

akarX23 avatar Feb 23 '24 05:02 akarX23

Hi @akarX23, could you update transformers to >=4.35.0? See https://github.com/intel/llm-on-ray/tree/main

Vasud-ha avatar Feb 23 '24 11:02 Vasud-ha

Hi @Vasud-ha, the reason I am using transformers 4.31 is that bigdl-llm[all] requires transformers 4.31. This is the output of pip install ".[cpu,bigdl-cpu]":

134.0 The conflict is caused by:
134.0     llm-on-ray[bigdl-cpu,cpu] 0.0.1 depends on transformers>=4.35.0; extra == "cpu"
134.0     bigdl-llm[all] 2.5.0b20240222 depends on transformers==4.31.0; extra == "all"

I would have to use either ipex or bigdl-llm, one at a time.

akarX23 avatar Feb 23 '24 11:02 akarX23

Can you suggest a version of bigdl-llm compatible with transformers 4.35 or 4.37?

akarX23 avatar Feb 23 '24 11:02 akarX23

Could you try installing bigdl-llm with this command: pip install --pre --upgrade bigdl-llm[xpu_2.1] -f https://developer.intel.com/ipex-whl-stable-xpu, with transformers 4.35 or newer?
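
For completeness, a minimal sketch of loading the model through bigdl-llm's transformers-style wrapper after that install; the low-bit option shown is an assumption, not something specified in this thread:

```python
# Sketch assuming bigdl-llm's drop-in AutoModelForCausalLM wrapper
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_id = "meta-llama/llama-2-7b-chat-hf"

# load_in_4bit converts the weights to bigdl-llm's low-bit format at load time
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```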

Vasud-ha avatar Feb 26 '24 08:02 Vasud-ha

@akarX23 you need to update torch/ipex to 2.2 and transformers to 4.35.2; see https://github.com/intel/llm-on-ray/pull/143
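
A quick sanity check that the environment actually picked up the suggested versions (not specific to llm-on-ray):

```python
import torch
import transformers
import intel_extension_for_pytorch as ipex

# Expect torch/ipex 2.2.x and transformers 4.35.2 per the suggestion above
print("torch:", torch.__version__)
print("ipex:", ipex.__version__)
print("transformers:", transformers.__version__)
```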

xwu-intel avatar Mar 14 '24 05:03 xwu-intel

@Vasud-ha @xwu99, I will try these suggestions soon. Currently I have shifted to OVMS with ITREX, and the performance is pretty good. I will update you with what I find.

akarX23 avatar Mar 14 '24 05:03 akarX23

Hi @akarX23, are you still working on llm-on-ray, and does the issue persist? Thanks.

ZailiWang avatar Aug 22 '24 09:08 ZailiWang

Hi @ZailiWang, the work with llm-on-ray has currently stopped, and I have not checked whether the issue persists. We have customers who will be trying out llm-on-ray soon; if anything comes up I will raise the issue again. Thank you for the help!

akarX23 avatar Aug 22 '24 09:08 akarX23