3 comments by binbin Deng
Hi, to run neural-chat-7b inference using DeepSpeed AutoTP and our low-bit optimization, you can follow these steps: 1) Prepare your environment following the [installation steps](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Deepspeed-AutoTP#1-install). Especially for the neural-chat-7b model, you...
Devices 0 and 1 are used by default in our script. Please refer to [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/KeyFeatures/multi_gpus_selection.html) for more details on how to select devices. According to my experiment on 2 A770,...
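The device-selection step above can be sketched as follows. This is a minimal illustration, assuming the oneAPI `ONEAPI_DEVICE_SELECTOR` environment variable is used to restrict which Level Zero GPUs are visible to the process; the device indices `0,1` are example values and should be adjusted to your setup.

```shell
# Restrict the run to Level Zero GPU devices 0 and 1 (example indices;
# adjust to match the GPUs you want DeepSpeed AutoTP to use).
export ONEAPI_DEVICE_SELECTOR="level_zero:0,1"

# Confirm the selector is set before launching the inference script.
echo "Using devices: $ONEAPI_DEVICE_SELECTOR"
```

With the selector exported, any subsequently launched process (e.g. the AutoTP run script) will only see the listed devices.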