Kernel XXX removed due to usage of FP64 instructions despite no FP64 weights/data
When using v1.10.200+gpu with my Arc A770 I get errors stating:
[CRITICAL ERROR] Kernel [XXX] removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
The release notes mention this error:
FP64 is not natively supported by the Intel® Data Center GPU Flex Series platform. If you run any AI workload on that platform and receive this error message, it means a kernel requiring FP64 instructions is removed and not executed, hence the accuracy of the whole workload is wrong.
My confusion is that I don't believe I've ever explicitly selected a kernel, nor do I know how to change one. I've also verified that both my model and my data use torch.float32 as their dtype.
I am using the "optimize" method. Newer versions have an "auto_kernel_selection" parameter, but v1.10.200 doesn't seem to have any parameter that affects kernel selection. Removing the call to optimize makes the error go away.
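For reference, this is a minimal sketch of the kind of call that triggers the error for me (the model and tensors are placeholders, not my actual script):

```python
import torch
import torch.nn as nn
import intel_extension_for_pytorch as ipex  # registers the "xpu" device

# A tiny FP32 model; nothing in it is declared as float64.
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU()).to("xpu").eval()
x = torch.randn(8, 16, dtype=torch.float32, device="xpu")

# Confirm nothing is FP64 before handing the model to IPEX.
print({name: p.dtype for name, p in model.named_parameters()})
print("X dtype:", x.dtype)

# The error only appears once optimize() is in the picture;
# dropping this call makes it go away.
model = ipex.optimize(model, dtype=torch.float32)

with torch.no_grad():
    y = model(x)
print("Y dtype:", y.dtype)
```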
A simple example script, along with my environment (conda & env variables) and logs, can be found here.
For now I'll just remove the call to optimize, but eventually I would like to take advantage of what it has to offer. Is there anything I'm overlooking?
Mentioned here as well: https://github.com/intel/intel-extension-for-pytorch/issues/257. You can disable profiling as a workaround; that fixes most of these issues, but it probably also disables the runtime optimisations.
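I haven't verified the exact switch myself — the linked issue has the details — but the general pattern is to set the toggle before the extension is imported. A sketch, where `IPEX_DISABLE_PROFILING` is a placeholder name and not the documented variable:

```python
import os

# Placeholder name; check issue #257 for the actual profiling
# switch in your IPEX version.
os.environ["IPEX_DISABLE_PROFILING"] = "1"

# Import only after the environment variable is set.
import torch
import intel_extension_for_pytorch as ipex
```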
From what I understand, part of BatchNorm2d contains an FP64 value, which triggers this. It will be fixed in the next version.
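If you want to poke at this yourself, a quick sketch to print the dtype of every parameter and buffer in a model containing BatchNorm2d — note that the Python-side dtypes may all look fine, since the FP64 value (e.g. a constant like eps, which is a Python double) can be introduced when the kernel is built rather than being stored on the module:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())

# Weights and biases.
for name, p in model.named_parameters():
    print(f"param  {name}: {p.dtype}")

# running_mean / running_var / num_batches_tracked etc.
for name, b in model.named_buffers():
    print(f"buffer {name}: {b.dtype}")
```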
@yohnk Just tried running your script and I don't face this issue on IPEX 1.13. It would be great if you could test on this newer version and report back. Revisit the install instructions here: https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/installation.html
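For completeness, a quick sketch to confirm what you end up with after reinstalling (assuming an XPU build of IPEX, where importing the extension registers the xpu device):

```python
import torch
import intel_extension_for_pytorch as ipex

print("torch:", torch.__version__)
print("ipex:", ipex.__version__)
print("xpu available:", torch.xpu.is_available())
if torch.xpu.is_available():
    print("device:", torch.xpu.get_device_name(0))
```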
Can confirm as well: the script runs cleanly on my Tiger Lake Intel iGPU setup, this time with an even newer version of IPEX built from source (2.0.110):
```
(ipex_env) minbuntu@minbuntu:~/intel_pytorch_workspace/final_bench_scripts$ python3 yohnk_intel-kernel-error_main.py
No CUDA runtime is found, using CUDA_HOME='/usr'
Linear Layer dtype: torch.float32
Can not find transformers in your environment, disable ipex transformer optimize
X dtype: torch.float32
Y dtype: torch.float32
Complete
```
The `Can not find transformers in your environment, disable ipex transformer optimize` message is also mentioned here, but appears to be harmless for now :eyes:
Please also see this.