GLEN BERTULFO
Hi Yang, is there a workaround to get 6 or 8 GPUs working?
Hi Yang, going back to 8 GPUs on Flex with a 32 attention head count: I reran on the same platform and verified this info when I did print(model) -- 32...
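(Side note: a common constraint in tensor-parallel sharding is that the attention head count must divide evenly by the GPU count, which is why 6 GPUs can fail where 8 succeed with 32 heads. A minimal sketch of that check, with illustrative config values matching Llama-2-7B's published shape rather than anything loaded from the actual checkpoint:)

```python
import json

# Illustrative config snippet; the real values live in the checkpoint's
# config.json (Llama-2-7B reports 32 attention heads, as print(model) confirmed).
config = json.loads('{"num_attention_heads": 32, "hidden_size": 4096}')

heads = config["num_attention_heads"]
for gpus in (6, 8):
    # Tensor parallelism shards attention heads across devices, so the
    # head count must be evenly divisible by the number of GPUs.
    divisible = heads % gpus == 0
    print(f"{gpus} GPUs: {heads} heads divisible -> {divisible}")
```

With 32 heads this reports that 6 GPUs do not divide evenly while 8 do, which would explain a 6-GPU failure on this model.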
Hi Yang, please see the attached output text file -- 8GPUs_llama2_7B.txt [8GPUs_llama2_7B.txt](https://github.com/intel-analytics/ipex-llm/files/14910582/8GPUs_llama2_7B.txt)
Hi Yang, an update: CPU memory does hit maximum utilization when running 8 GPUs with the Vicuna 33B model on the DUT. Please see the attached Vicuna 33B full text output log...
Hi Yang, from our debug sync you indicated that on the same machine your fellow team members were not seeing issues on the 8-GPU config. May I kindly ask for the...
Issue is resolved. Closing this ticket. Thank you team for your help.