Kevin Tang
Running Qwen-7B-Chat with ipex-llm + DeepSpeed fails with the following error:

[0] RuntimeError: shape '[1, 1024, 16, 128]' is invalid for input of size 4194304

Environment:
accelerate 0.29.2
mpi4py 3.1.6
bigdl-core-xe-21 2.5.0b20240411
bigdl-core-xe-esimd-21...
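For reference, a minimal sketch of the DeepSpeed AutoTP + ipex-llm launch flow being described, assuming two Arc GPUs started via `mpirun`, oneCCL bindings for the distributed backend, and the usual `optimize_model(..., low_bit="sym_int4")` step; the model path, prompt, and token counts are placeholders, not the exact script that hit the error.

```python
# Minimal sketch of the DeepSpeed AutoTP + ipex-llm flow (assumptions: launched
# via `mpirun -n 2`, oneCCL bindings installed, LOCAL_RANK/WORLD_SIZE set).
import os

import torch
import deepspeed
import intel_extension_for_pytorch as ipex  # noqa: F401  (registers the XPU backend)
from transformers import AutoModelForCausalLM, AutoTokenizer
from ipex_llm import optimize_model

local_rank = int(os.environ.get("LOCAL_RANK", "0"))
world_size = int(os.environ.get("WORLD_SIZE", "1"))

model_path = "Qwen/Qwen-7B-Chat"  # placeholder path

# Load the full-precision model on CPU first.
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, trust_remote_code=True)

# Shard the weights across ranks with DeepSpeed AutoTP (no kernel injection),
# then apply ipex-llm low-bit optimization and move the local shard to this
# rank's Arc GPU.
deepspeed.init_distributed(dist_backend="ccl")  # assumes oneccl_bindings_for_pytorch
model = deepspeed.init_inference(
    model,
    tensor_parallel={"tp_size": world_size},
    dtype=torch.float16,
    replace_with_kernel_inject=False)
model = optimize_model(model.module.to("cpu"), low_bit="sym_int4")
model = model.to(f"xpu:{local_rank}")

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer("Hello", return_tensors="pt").to(f"xpu:{local_rank}")
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=32)
if local_rank == 0:
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```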
Platform: Ubuntu 22.04 with Arc A770
Model: Meta-Llama-3-8B-Instruct
Config ① — source oneapi/2024.0
Config ② — source oneapi/2024.1

Config ① results:

| model | 1st token (ms) | next token (ms/token) | encoder (ms) | input/output | batch | actual input/output | num_beams | low_bit | cpu_embedding | load time (s) | peak mem (GB) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| meta-llama/Meta-Llama-3-8B-Instruct | 476.44 | **14.67** | 0.0 | 1024-512 | 1 | 1024-512 | 1 | sym_int4 | True | 118.7 | 4.849609375 |
| meta-llama/Meta-Llama-3-8B-Instruct | 1044.11 | 15.46 | 0.0 | 2048-512 | 1 | 2038-512 | 1 | sym_int4 | True | 118.7 | 5.7109375 |

Config ② results:

| model | 1st token (ms) | next token (ms/token) | encoder (ms) | input/output | batch | actual input/output | num_beams | low_bit | load time (s) | peak mem (GB) |
|---|---|---|---|---|---|---|---|---|---|---|
| meta-llama/Meta-Llama-3-8B-Instruct | 455.34 | **21.77** | 0.0 | 1024-512 | 1 | 1024-512 | 1 | sym_int4 | 10.28 | 5.896484375 |
| meta-llama/Meta-Llama-3-8B-Instruct | 2656.5 | 23.43 | 0.0 | 2048-512 | 1 | 2038-512 | 1 | sym_int4 | 10.28 | 6.6171875 |

So please help double...
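If it helps to reproduce the gap outside the full benchmark harness, below is a rough latency micro-benchmark sketch (not the all-in-one benchmark script itself); the prompt construction, 512-token generation length, and sym_int4 load are assumptions chosen to roughly mirror the 1024-512 rows above.

```python
# Rough micro-benchmark sketch: load Meta-Llama-3-8B-Instruct in sym_int4 on one
# Arc GPU and time the first token vs. the remaining tokens. Prompt length and
# token counts are crude approximations of the 1024-512 config above.
import time

import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  (registers the XPU backend)
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

model_path = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder path
model = AutoModelForCausalLM.from_pretrained(
    model_path, load_in_low_bit="sym_int4", trust_remote_code=True).to("xpu")
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

prompt = "hello " * 1024  # crude stand-in for a ~1024-token prompt
inputs = tokenizer(prompt, return_tensors="pt").to("xpu")

with torch.inference_mode():
    # First token: one forward pass over the full prompt.
    t0 = time.time()
    model.generate(**inputs, max_new_tokens=1)
    torch.xpu.synchronize()
    first_ms = (time.time() - t0) * 1000

    # Full 512-token generation; subtract the first-token time to get a rough
    # per-token latency for the remaining 511 tokens.
    t0 = time.time()
    model.generate(**inputs, max_new_tokens=512)
    torch.xpu.synchronize()
    total_ms = (time.time() - t0) * 1000
    rest_ms = (total_ms - first_ms) / 511

print(f"1st token: {first_ms:.2f} ms, next tokens: {rest_ms:.2f} ms/token")
```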
Please help to implement internlm-xcomposer2-vl-7b serving support on the lightweight serving framework or some other framework.