intel-extension-for-pytorch
Under what settings will ipex.llm.optimize be faster than torch.compile?
Describe the issue
I was testing on Max 1100 and Max 1550 GPUs using a conda environment set up as described in these instructions. I compared the performance of torch.compile and ipex.llm.optimize with float16, and observed that several LLMs run slower with ipex.llm.optimize than with torch.compile.
Could you clarify the settings or scenarios in which ipex.llm.optimize() is expected to outperform torch.compile?
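For reference, a minimal timing sketch of the kind of comparison described above. The `benchmark` helper is generic stdlib code; the commented-out usage assumes a Hugging Face-style model with a `generate` method, an XPU device, and the documented `ipex.llm.optimize(model, dtype=..., device=...)` entry point — exact model setup, input preparation, and any needed `torch.xpu.synchronize()` calls before and after timing are omitted and would depend on the actual test.

```python
import time

def benchmark(fn, *args, warmup=3, iters=10):
    """Average wall-clock time of fn(*args) after a few warmup calls."""
    for _ in range(warmup):
        fn(*args)
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - start) / iters

# Hypothetical usage on an Intel GPU (requires torch + intel_extension_for_pytorch;
# model/input setup elided, and device timing should synchronize the XPU queue):
#
#   import torch
#   import intel_extension_for_pytorch as ipex
#
#   model = model.to("xpu").to(torch.float16).eval()
#   ipex_model = ipex.llm.optimize(model, dtype=torch.float16, device="xpu")
#   compiled_model = torch.compile(model)
#
#   t_ipex = benchmark(lambda: ipex_model.generate(input_ids, max_new_tokens=32))
#   t_compile = benchmark(lambda: compiled_model.generate(input_ids, max_new_tokens=32))
```

Sharing numbers from a harness like this (per-model, with dtype, batch size, and sequence lengths) usually makes it easier to pin down which path is expected to win.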