binbin Deng
Hi, @oldmikeyang, the error is caused by running out of GPU memory. We haven't experimented with 72B & fp6 through DeepSpeed AutoTP on 4 ARC GPUs; please try vLLM tensor parallelism and pipeline...
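For context, a rough back-of-envelope estimate suggests why a 72B model is tight on 4 ARC cards. This sketch assumes ~6 bits per weight for fp6 and 16 GB of memory per ARC GPU (both figures are assumptions here, not taken from the issue):

```python
# Back-of-envelope memory estimate for a 72B-parameter model in fp6,
# tensor-parallel across 4 GPUs. Assumptions: fp6 stores ~6 bits per
# weight, and each ARC card has 16 GB of device memory.
params = 72e9            # 72B parameters
bits_per_weight = 6      # fp6
num_gpus = 4

weight_bytes = params * bits_per_weight / 8   # total weight memory
per_gpu_bytes = weight_bytes / num_gpus       # ideal even TP split

print(f"total weights: {weight_bytes / 1e9:.1f} GB")   # 54.0 GB
print(f"per GPU:       {per_gpu_bytes / 1e9:.1f} GB")  # 13.5 GB
# With ~13.5 GB of weights on a 16 GB card, little headroom is left
# for activations and the KV cache, so OOM is plausible even before
# any runtime overhead is counted.
```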
`Qwen2-7B-Instruct` has been verified on our side, and we will first try to reproduce your error. We will inform you as soon as there is progress.
Hi, @grandxin, I could not reproduce this error on MTL with the `32.0.100.2540` driver. Using `ipex-llm==2.1.0b20240814`, the output of `Qwen2-1.5B-Instruct` with `load_low_bit=sym_int4` is:

```bash
-------------------- Output --------------------
system You...
```
Hi, @wallacezq, please try the latest ipex-llm (`2.2.0b20241008`) and the related example (https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/llama3.2-vision).
Hi, the memory size of a single Flex 170 is only 16 GB, so it is expected that you encountered `PI_ERROR_OUT_OF_RESOURCES` (which means out of memory) when you tried to use two Flex 170 cards to...
> Is the bad termination an expected behavior for now?

Yes, please refer to [the known issue](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Deepspeed-AutoTP#known-issue) for details.
Hi, @jianweimama, we will inform you as soon as the bug is fixed.
Hi, @jianweimama, this bug has been fixed; please try a nightly version of ipex-llm later than `2.1.0b20240625`.