Kevin Tang
Running Qwen-7B-Chat with ipex-llm + DeepSpeed fails with the following error:

[0] RuntimeError: shape '[1, 1024, 16, 128]' is invalid for input of size 4194304

Environment:
accelerate 0.29.2
mpi4py 3.1.6
bigdl-core-xe-21 2.5.0b20240411
bigdl-core-xe-esimd-21...
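For reference, a minimal sketch of the DeepSpeed AutoTP + ipex-llm launch flow being described, assuming two Arc GPUs started via `mpirun`, oneCCL bindings for the distributed backend, and the usual `optimize_model(..., low_bit="sym_int4")` step; the model path, prompt, and token counts are placeholders, not the exact script that hit the error.

```python
# Minimal sketch of the DeepSpeed AutoTP + ipex-llm flow (assumptions: launched
# via `mpirun -n 2`, oneCCL bindings installed, LOCAL_RANK/WORLD_SIZE set).
import os

import torch
import deepspeed
import intel_extension_for_pytorch as ipex  # noqa: F401  (registers the XPU backend)
from transformers import AutoModelForCausalLM, AutoTokenizer
from ipex_llm import optimize_model

local_rank = int(os.environ.get("LOCAL_RANK", "0"))
world_size = int(os.environ.get("WORLD_SIZE", "1"))

model_path = "Qwen/Qwen-7B-Chat"  # placeholder path

# Load the full-precision model on CPU first.
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, trust_remote_code=True)

# Shard the weights across ranks with DeepSpeed AutoTP (no kernel injection),
# then apply ipex-llm low-bit optimization and move the local shard to this
# rank's Arc GPU.
deepspeed.init_distributed(dist_backend="ccl")  # assumes oneccl_bindings_for_pytorch
model = deepspeed.init_inference(
    model,
    tensor_parallel={"tp_size": world_size},
    dtype=torch.float16,
    replace_with_kernel_inject=False)
model = optimize_model(model.module.to("cpu"), low_bit="sym_int4")
model = model.to(f"xpu:{local_rank}")

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer("Hello", return_tensors="pt").to(f"xpu:{local_rank}")
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=32)
if local_rank == 0:
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```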
Platform: Ubuntu 22.04 with Arc A770
Model: Meta-Llama-3-8B-Instruct
Config ① — source oneapi/2024.0
Config ② — source oneapi/2024.1

Config ① results:

| model | 1st token (ms) | next token (ms/token) | encoder (ms) | input/output | batch | actual input/output | num_beams | low_bit | cpu_embedding | load time (s) | peak mem (GB) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| meta-llama/Meta-Llama-3-8B-Instruct | 476.44 | **14.67** | 0.0 | 1024-512 | 1 | 1024-512 | 1 | sym_int4 | True | 118.7 | 4.849609375 |
| meta-llama/Meta-Llama-3-8B-Instruct | 1044.11 | 15.46 | 0.0 | 2048-512 | 1 | 2038-512 | 1 | sym_int4 | True | 118.7 | 5.7109375 |

Config ② results:

| model | 1st token (ms) | next token (ms/token) | encoder (ms) | input/output | batch | actual input/output | num_beams | low_bit | load time (s) | peak mem (GB) |
|---|---|---|---|---|---|---|---|---|---|---|
| meta-llama/Meta-Llama-3-8B-Instruct | 455.34 | **21.77** | 0.0 | 1024-512 | 1 | 1024-512 | 1 | sym_int4 | 10.28 | 5.896484375 |
| meta-llama/Meta-Llama-3-8B-Instruct | 2656.5 | 23.43 | 0.0 | 2048-512 | 1 | 2038-512 | 1 | sym_int4 | 10.28 | 6.6171875 |

So please help double...
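If it helps to reproduce the gap outside the full benchmark harness, below is a rough latency micro-benchmark sketch (not the all-in-one benchmark script itself); the prompt construction, 512-token generation length, and sym_int4 load are assumptions chosen to roughly mirror the 1024-512 rows above.

```python
# Rough micro-benchmark sketch: load Meta-Llama-3-8B-Instruct in sym_int4 on one
# Arc GPU and time the first token vs. the remaining tokens. Prompt length and
# token counts are crude approximations of the 1024-512 config above.
import time

import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  (registers the XPU backend)
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

model_path = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder path
model = AutoModelForCausalLM.from_pretrained(
    model_path, load_in_low_bit="sym_int4", trust_remote_code=True).to("xpu")
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

prompt = "hello " * 1024  # crude stand-in for a ~1024-token prompt
inputs = tokenizer(prompt, return_tensors="pt").to("xpu")

with torch.inference_mode():
    # First token: one forward pass over the full prompt.
    t0 = time.time()
    model.generate(**inputs, max_new_tokens=1)
    torch.xpu.synchronize()
    first_ms = (time.time() - t0) * 1000

    # Full 512-token generation; subtract the first-token time to get a rough
    # per-token latency for the remaining 511 tokens.
    t0 = time.time()
    model.generate(**inputs, max_new_tokens=512)
    torch.xpu.synchronize()
    total_ms = (time.time() - t0) * 1000
    rest_ms = (total_ms - first_ms) / 511

print(f"1st token: {first_ms:.2f} ms, next tokens: {rest_ms:.2f} ms/token")
```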
Please help to implement internlm-xcomposer2-vl-7b serving support on the lightweight serving framework or some other framework.