binbin Deng


Hi, to run neural-chat-7b inference with DeepSpeed AutoTP and our low-bit optimization, you could follow these steps: 1) Prepare your environment following the [installation steps](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Deepspeed-AutoTP#1-install). Especially for the neural-chat-7b model, you...
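
For reference, a minimal sketch of how such a script is typically structured (assuming the usual DeepSpeed AutoTP pattern; the model id, the `sym_int4` precision, and the rank/world-size handling below are illustrative assumptions, not the exact contents of the example, and parameter names may differ slightly across DeepSpeed versions):

```python
import os
import torch
import deepspeed
import intel_extension_for_pytorch as ipex  # noqa: F401  (registers the 'xpu' device)
from transformers import AutoModelForCausalLM
from ipex_llm import optimize_model

local_rank = int(os.environ.get("LOCAL_RANK", "0"))
world_size = int(os.environ.get("WORLD_SIZE", "1"))

# Load the checkpoint on CPU first; "Intel/neural-chat-7b-v3-1" is used here
# only as an illustrative model id.
model = AutoModelForCausalLM.from_pretrained(
    "Intel/neural-chat-7b-v3-1",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)

# Shard the model across ranks with DeepSpeed AutoTP (no kernel injection).
model = deepspeed.init_inference(
    model,
    mp_size=world_size,
    dtype=torch.float16,
    replace_with_kernel_inject=False,
)

# Apply low-bit (e.g. symmetric int4) optimization to the local shard,
# then move it onto this rank's Intel GPU.
model = optimize_model(model.module.to("cpu"), low_bit="sym_int4")
model = model.to(f"xpu:{local_rank}")
```

Each rank loads and shards the model, then quantizes only its own shard, so host memory pressure stays bounded before the weights land on the GPUs.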

Devices 0 and 1 are used by default in our script. Please refer to [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/KeyFeatures/multi_gpus_selection.html) for more details about how to select devices. According to my experiment on 2 A770,...
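
If you want to run on GPUs other than 0 and 1, one common approach is to restrict the devices visible to the process via `ONEAPI_DEVICE_SELECTOR` before torch/IPEX initializes. A small sketch, assuming the Level Zero backend and GPU indices 2 and 3 as an example:

```python
import os

# Assumption: Level Zero backend; expose only physical GPUs 2 and 3 to this
# process, so they appear as xpu:0 and xpu:1 inside the script.
os.environ["ONEAPI_DEVICE_SELECTOR"] = "level_zero:2,3"

import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  (registers the 'xpu' device)

# Should now report only the selected devices.
print(torch.xpu.device_count())
```

In practice you would usually export this variable in the shell or launch script before starting the ranks, so every worker process sees the same device mapping.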