ZhangWei125521
My PC contains both an Nvidia dGPU and an Intel iGPU. How can I switch between the GPUs freely? Inference always runs on the Nvidia dGPU, and I have no idea how to select...
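With ipex-llm the Intel GPU is normally addressed through PyTorch's `"xpu"` device (e.g. `model.to("xpu")`), while the Nvidia card is `"cuda"`. As a minimal sketch of the selection logic being asked about — `pick_device` is a hypothetical helper, and the availability flags stand in for `torch.xpu.is_available()` / `torch.cuda.is_available()` so the sketch runs without either GPU present:

```python
def pick_device(prefer="xpu", xpu_available=False, cuda_available=False):
    """Return a PyTorch-style device string.

    prefer: the device to use when available ("xpu" = Intel GPU).
    Falls back to CUDA, then CPU. In real code the flags would come
    from torch.xpu.is_available() and torch.cuda.is_available().
    """
    if prefer == "xpu" and xpu_available:
        return "xpu"
    if prefer == "cuda" and cuda_available:
        return "cuda"
    # preferred device missing: take whatever accelerator exists
    if xpu_available:
        return "xpu"
    if cuda_available:
        return "cuda"
    return "cpu"

# Force the Intel iGPU even when the Nvidia dGPU is also present:
print(pick_device(prefer="xpu", xpu_available=True, cuda_available=True))  # xpu
```

Once a device string is chosen, moving the model with `model = model.to(device)` before inference is the usual pattern; exact flags for ipex-llm should be checked against its own docs.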
I can run Qwen2.5-Omni after `pip install git+https://github.com/huggingface/transformers@f742a644ca32e65758c3adb36225aef1731bd2a8`, and it runs successfully. But I want to use the multi-batch function, and https://github.com/huggingface/transformers@f742a644ca32e65758c3adb36225aef1731bd2a8 does not contain...
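Multi-batch generation in transformers usually comes down to padding several prompts to a common length and passing an attention mask (for decoder-only models, left padding). The padding step itself can be sketched as follows — `pad_batch` is a hypothetical helper standing in for what `tokenizer(prompts, padding=True)` does internally:

```python
def pad_batch(sequences, pad_id=0):
    """Left-pad variable-length token-id lists to one length.

    Returns (padded_ids, attention_mask), where mask is 1 for real
    tokens and 0 for padding. Left padding keeps the last real token
    at the end of each row, which decoder-only generation expects.
    """
    max_len = max(len(s) for s in sequences)
    ids, mask = [], []
    for s in sequences:
        pad = max_len - len(s)
        ids.append([pad_id] * pad + list(s))
        mask.append([0] * pad + [1] * len(s))
    return ids, mask

ids, mask = pad_batch([[1, 2, 3], [4]])
print(ids)   # [[1, 2, 3], [0, 0, 4]]
print(mask)  # [[1, 1, 1], [0, 0, 1]]
```

With a real tokenizer this is typically `tokenizer(prompts, return_tensors="pt", padding=True)` with `tokenizer.padding_side = "left"`, then one `model.generate(**inputs)` call for the whole batch; whether that specific transformers commit supports it for Qwen2.5-Omni would need checking.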
I hope the output comes token by token during inference rather than as one whole block after inference finishes. Does ipex-llm support this or not?
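In plain transformers this is done by passing a streamer (e.g. `TextIteratorStreamer`) to `model.generate(..., streamer=...)`, and since ipex-llm models keep the transformers `generate` interface the same pattern should generally apply, though that is worth confirming in the ipex-llm docs. The producer/consumer mechanism behind such a streamer can be sketched as follows — `TokenStreamer` and `fake_generate` are hypothetical stand-ins for the library classes:

```python
import queue
import threading

class TokenStreamer:
    """Minimal streamer: the generator thread pushes text pieces,
    the caller iterates over them as they arrive."""
    _END = object()  # sentinel marking end of generation

    def __init__(self):
        self._q = queue.Queue()

    def put(self, piece):
        self._q.put(piece)

    def end(self):
        self._q.put(self._END)

    def __iter__(self):
        while True:
            item = self._q.get()
            if item is self._END:
                return
            yield item

def fake_generate(streamer, pieces):
    """Stand-in for model.generate(..., streamer=streamer)."""
    for p in pieces:
        streamer.put(p)  # emitted as soon as each token is decoded
    streamer.end()

streamer = TokenStreamer()
threading.Thread(target=fake_generate, args=(streamer, ["Hel", "lo", "!"])).start()
for piece in streamer:          # prints pieces as they arrive
    print(piece, end="", flush=True)
```

With the real API, `generate` runs in a background thread while the main thread iterates the `TextIteratorStreamer` and prints each piece, giving token-by-token output.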