binbin Deng
Hi, @oldmikeyang, the error is caused by running out of GPU memory. We haven't experimented with 72B & fp6 through DeepSpeed AutoTP on 4 ARC GPUs; please try vLLM tensor parallelism and pipeline...
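For context, a rough back-of-envelope estimate suggests why a 72B model is tight on 4 ARC cards. This sketch assumes ~6 bits per weight for fp6 and 16 GB of memory per ARC GPU (both figures are assumptions here, not taken from the issue):

```python
# Back-of-envelope memory estimate for a 72B-parameter model in fp6,
# tensor-parallel across 4 GPUs. Assumptions: fp6 stores ~6 bits per
# weight, and each ARC card has 16 GB of device memory.
params = 72e9            # 72B parameters
bits_per_weight = 6      # fp6
num_gpus = 4

weight_bytes = params * bits_per_weight / 8   # total weight memory
per_gpu_bytes = weight_bytes / num_gpus       # ideal even TP split

print(f"total weights: {weight_bytes / 1e9:.1f} GB")   # 54.0 GB
print(f"per GPU:       {per_gpu_bytes / 1e9:.1f} GB")  # 13.5 GB
# With ~13.5 GB of weights on a 16 GB card, little headroom is left
# for activations and the KV cache, so OOM is plausible even before
# any runtime overhead is counted.
```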
`Qwen2-7B-Instruct` has been verified on our side, and we will first try to reproduce your error. We will inform you as soon as there is progress.
Hi, @grandxin, I could not reproduce this error on MTL with the `32.0.100.2540` driver. Using `ipex-llm==2.1.0b20240814`, the output of `Qwen2-1.5B-Instruct` with `load_low_bit=sym_int4` is:

```bash
-------------------- Output --------------------
system You...
```
Hi, @wallacezq, please try the latest ipex-llm (`2.2.0b20241008`) and the related example (https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/llama3.2-vision).
Hi, the memory size of a single Flex 170 is only 16 GB, so it is expected that you encountered `PI_ERROR_OUT_OF_RESOURCES` (which means out of memory) when you tried to use two Flex 170 cards to...
> Is the bad termination an expected behavior for now?

Yes, please refer to [the known issue](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Deepspeed-AutoTP#known-issue) for details.
Hi, @jianweimama, we will inform you as soon as the bug is fixed.
Hi, @jianweimama, this bug has been fixed; please try a nightly version of ipex-llm later than `2.1.0b20240625`.