Zhao Changmin
I believe this is an IPEX issue: in this case, where both a dGPU and an iGPU are present, IPEX will always select the first device in the list as the executed...
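For anyone hitting this, here is a minimal sketch of pinning work to a specific XPU device instead of relying on the default (assuming IPEX's `torch.xpu` API; the device index is illustrative and depends on your machine):

```python
# Sketch: list visible XPU devices and pick one explicitly.
# Requires an XPU build of intel_extension_for_pytorch.
import torch
import intel_extension_for_pytorch as ipex  # registers the 'xpu' backend

# With a dGPU and an iGPU both present, index 0 is simply
# whatever the runtime enumerates first.
for i in range(torch.xpu.device_count()):
    print(i, torch.xpu.get_device_name(i))

# Move work to a chosen device rather than the default "xpu".
device = torch.device("xpu:1")  # e.g. the second enumerated device
x = torch.randn(4, 4).to(device)
print(x.device)
```

The SYCL runtime also respects the `ONEAPI_DEVICE_SELECTOR` environment variable (e.g. `level_zero:1`) if you prefer to filter devices before the process starts.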
The loading-failure issue has been fixed in the attached PR.
> Also found that 2.5.0b20240213 rwkv model loading at runtime is much slower than 2.5.0b20240204, about 4...
Will fix in https://github.com/intel-analytics/BigDL/pull/10179
Hi, I think `VF.drop` is not implemented by our kernels; instead, I suppose this error indicates that `input` is in an 8-bit data format, which is not a supported...
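As a rough illustration of the workaround (assuming the error really does come from an int8 input; the helper name is hypothetical), casting the tensor to a floating-point dtype before the op avoids the unsupported kernel:

```python
# Sketch: guard against 8-bit inputs before an op whose kernel
# only supports floating-point tensors.
import torch

def ensure_float(t: torch.Tensor) -> torch.Tensor:
    """Cast int8/uint8 tensors to float32 so float-only kernels accept them."""
    if t.dtype in (torch.int8, torch.uint8):
        return t.to(torch.float32)
    return t

x = torch.randint(-128, 127, (8,), dtype=torch.int8)
x = ensure_float(x)
y = torch.nn.functional.dropout(x, p=0.1, training=True)  # now runs on a float tensor
```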
Sorry, I cannot reproduce this issue on qwen-7b-chat:
```
(changmin-llm) arda@arda-arc13:~/changmin/llm.cpp$ pip install ipex-llm==2.1.0b20240521
Collecting ipex-llm==2.1.0b20240521
  Using cached ipex_llm-2.1.0b20240521-py3-none-manylinux2010_x86_64.whl.metadata (5.0 kB)
Using cached ipex_llm-2.1.0b20240521-py3-none-manylinux2010_x86_64.whl (13.8 MB)
Installing collected...
```
Fixed in https://github.com/intel-analytics/ipex-llm/pull/11110.
Hi, you may refer to https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/install_linux_gpu.html#optional-update-level-zero-on-intel-core-ultra-igpu to check your MTL Linux machine's status, and https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/benchmark_quickstart.html#run-on-linux to run your code.
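For reference, here is a minimal sketch of the kind of load-and-generate script that quickstart benchmarks (the model path and prompt are placeholders, not taken from the docs):

```python
# Sketch: 4-bit load and timed generation with ipex-llm on an XPU device.
import time
import torch
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen-7B-Chat"  # placeholder example model
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
model = model.to("xpu")
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

with torch.inference_mode():
    ids = tokenizer("What is AI?", return_tensors="pt").input_ids.to("xpu")
    t0 = time.perf_counter()
    out = model.generate(ids, max_new_tokens=32)
    torch.xpu.synchronize()  # wait for the device before reading the clock
    print(tokenizer.decode(out[0]), f"({time.perf_counter() - t0:.2f}s)")
```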
We need to do some adaptation work for this gptj-quantized qwen-vl model: https://github.com/intel-analytics/ipex-llm/blob/main/python/llm/example/GPU/PyTorch-Models/Model/qwen-vl/chat.py. If your goal is to use the qwen-vl model on MTL Linux, we recommend that you...
Hi, I think we have fixed this in the latest PR; could you try ipex-llm[cpp] >= 2.2.0b20240924 tomorrow?
You can find the relevant code here: https://github.com/pytorch/pytorch/blob/main/torch/nn/parallel/distributed.py#L809
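For context, here is a minimal `DistributedDataParallel` setup that exercises the linked code path (a generic sketch, not the internals at that line):

```python
# Sketch: single-node DDP; assumes env:// rendezvous variables
# (RANK, WORLD_SIZE, MASTER_ADDR, ...) are set, e.g. by torchrun.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group(backend="gloo")  # use "nccl" on CUDA machines
model = torch.nn.Linear(10, 10)
ddp_model = DDP(model)  # the linked code runs inside this wrapper's setup/forward

out = ddp_model(torch.randn(2, 10))
out.sum().backward()  # gradients are all-reduced across ranks
dist.destroy_process_group()
```

Run it with, e.g., `torchrun --nproc_per_node=2 script.py`.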