alexsin368
@chsasank I installed IPEX 2.1.40+xpu with Python 3.11.9, same as you, but I am only able to reproduce one of the two issues you see. I'm getting 16.72 TFLOPS and...
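For reference, that kind of number comes from timing a large half-precision matmul on the XPU device. A minimal sketch of such a measurement is below; the matrix size, dtype, and iteration count are my own choices for illustration, not necessarily what your script uses.

```python
import time
import torch
import intel_extension_for_pytorch as ipex  # registers the 'xpu' device

# Assumed benchmark parameters; the original report may use different ones.
N = 4096
iters = 100
a = torch.randn(N, N, dtype=torch.float16, device="xpu")
b = torch.randn(N, N, dtype=torch.float16, device="xpu")

# Warm up so kernel compilation is not included in the timing.
for _ in range(10):
    torch.matmul(a, b)
torch.xpu.synchronize()

start = time.time()
for _ in range(iters):
    torch.matmul(a, b)
torch.xpu.synchronize()
elapsed = time.time() - start

# A square GEMM performs 2 * N^3 floating-point operations.
tflops = 2 * N**3 * iters / elapsed / 1e12
print(f"{tflops:.2f} TFLOPS")
```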
@chsasank The performance regression needs to be within the scope of IPEX itself for my team and me to continue debugging. Let's figure out whether the regression is indeed to...
Hi @yash3056, please describe your issue in detail and provide the code and steps to reproduce it.
@LeptonWu this issue could be related to https://github.com/intel/intel-extension-for-pytorch/issues/529 and my team members are looking into it.
@Pradeepa99 The release notes mention added support for the AWQ format, and it seems this refers to the usage of ipex.llm.optimize, where you can specify the quant_method as 'gptq'...
@Pradeepa99 Yes, the test case example you found is what I meant. IPEX does not have an example similar to the GPTQ one you found. We recommend you use Intel...
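If it helps, here is a minimal sketch of what I mean by the ipex.llm.optimize path with a pre-quantized checkpoint. The model ID and checkpoint path are placeholders, and keyword names such as quantization_config and low_precision_checkpoint may differ between IPEX releases, so please treat this as a sketch rather than the exact testcase.

```python
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative placeholders only -- not the model/checkpoint from this thread.
model_id = "meta-llama/Llama-2-7b-hf"
gptq_checkpoint_path = "saved_results/gptq_checkpoint.pt"

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Weight-only quantization config; options such as weight dtype and group size
# vary by IPEX release, so defaults are used here.
qconfig = ipex.quantization.get_weight_only_quant_qconfig_mapping()

# The pre-quantized (GPTQ-format) weights are handed over via
# low_precision_checkpoint; the exact expected format (plain state dict vs. a
# tuple that also carries a config with quant_method) depends on the version.
low_precision_checkpoint = torch.load(gptq_checkpoint_path)

model = ipex.llm.optimize(
    model,
    dtype=torch.float32,
    quantization_config=qconfig,
    low_precision_checkpoint=low_precision_checkpoint,
    inplace=True,
)
```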
@andyluo7 I will work on reproducing this issue and get back to you with my findings. Did you try passing in THUDM/chatglm2-6b directly as the model?
Issue reproduced. What version of transformers are you using? I have 4.37.0. I will be working with the team to resolve your issue.
@andyluo7 I found what's causing the issue: it happens when you pass in --token-latency as an input argument. Take a look at lines 211 and 215: https://github.com/intel/intel-extension-for-pytorch/blob/main/examples/cpu/inference/python/llm/single_instance/run_generation.py#L211-L215 For now, try...
@andyluo7 As a workaround for now, modify line 211 in the run_generation.py script to `gen_ids = output` and it should work.
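To make the decode path concrete, here is a small self-contained sketch of what that part of the script does and where the workaround goes. gpt2 and the prompt are placeholders, not the model from this issue, and the snippet paraphrases the script rather than copying it.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model and prompt for illustration only.
model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

input_ids = tokenizer("Hello, my name is", return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=16)

# run_generation.py (around line 211) does roughly:
#   gen_ids = output[0] if args.token_latency else output
# With --token-latency the script expects a (ids, latency) tuple, which is
# where the reported failure comes from. The workaround is the plain form:
gen_ids = output

gen_text = tokenizer.batch_decode(gen_ids, skip_special_tokens=True)
print(gen_text)
```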