Yuxuan Xia
This is the current env-check.sh result  
We cannot reproduce this issue. Peak memory for a 2k input is always larger than a 1k input's. If we use Quantized KV Cache, the long sequence's second-token latency might outperform the shorter sequence's....
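For context, KV-cache quantization typically stores the attention keys and values in int8 with small floating-point scales instead of fp16/fp32, which shrinks the cache and can shift the memory/latency trade-off for long sequences. Below is a minimal conceptual sketch of symmetric per-token int8 quantization; the function names and shapes are illustrative, not the library's actual implementation:

```python
# Conceptual sketch of int8 KV-cache quantization (illustrative only,
# not the actual Quantized KV Cache implementation).
import torch

def quantize_kv(x: torch.Tensor):
    """Quantize a (seq_len, num_heads, head_dim) cache tensor to int8
    with one max-abs scale per token (per row along seq_len)."""
    scale = x.abs().amax(dim=(1, 2), keepdim=True) / 127.0
    scale = scale.clamp(min=1e-8)                 # avoid divide-by-zero
    q = torch.round(x / scale).clamp(-127, 127).to(torch.int8)
    return q, scale

def dequantize_kv(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Dequantize back to float before the attention matmul.
    return q.float() * scale

k = torch.randn(2048, 32, 128)                    # a 2k-token key cache
qk, s = quantize_kv(k)

# int8 storage is ~4x smaller than fp32 (2x smaller than fp16).
print(k.element_size() * k.nelement() / 2**20, "MB fp32")
print(qk.element_size() * qk.nelement() / 2**20, "MB int8 (+ small scales)")
print("max dequant error:", (dequantize_kv(qk, s) - k).abs().max().item())
```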
I think the pretrained full model is provided in the repo, but it is not that obvious. You can check this [link](https://drive.google.com/drive/folders/15wx9vOM0euyizq-M1uINgN0_wjVRf9J3)
We cannot reproduce this issue. In our testing, **W4A16** Baichuan2 7B's peak memory grows with the input sequence length when the max output is 512. | | peak mem (GB) |...
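This growth is expected: even with int4 weights (W4A16), the KV cache holds one key and one value vector per layer per token, so its footprint is linear in sequence length. A back-of-the-envelope estimate, assuming Baichuan2 7B's Llama-like shape (32 layers, 32 heads, head dim 128; these config values are assumptions) and an fp16 cache:

```python
# Rough KV-cache size estimate; model shape values are assumptions
# based on Baichuan2 7B's Llama-like architecture.
def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128, dtype_bytes=2):
    # 2x for keys and values, one cache entry per layer per token
    return 2 * n_layers * seq_len * n_heads * head_dim * dtype_bytes

# Input lengths of 1k/2k plus the 512 max output tokens:
for tokens in (1024 + 512, 2048 + 512):
    print(f"{tokens:>5} tokens -> {kv_cache_bytes(tokens) / 2**30:.2f} GiB KV cache")
```

With these assumed shapes, a 2k input reserves roughly twice the cache of a 1k input, which matches peak memory growing with the input sequence.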
> Hello [@FrankLeeeee](https://github.com/FrankLeeeee) , yes, I would like to take this task and I will send out the PR later. Thank you! Hi Zixuan, may I ask what you are...
> Hello [@NovTi](https://github.com/NovTi) , I reviewed your PR, and it shouldn't have any conflicts with yours~ I just did an improvement on the grouped_topk logic. Cooooool