intel-ravig
@simonlui - I took over this ticket recently and have been able to reproduce this issue. After debugging and tracing a little bit, I suspect the issue is originating from...
Thanks @simonlui. I do understand that the newer version does not work, but both PyTorch and IPEX are upgraded. Some comments here - a. Have you verified the same...
@simonlui - This task is on our engineering team's priority list, and they are actively working on it. However, it will likely not get fixed in the upcoming...
@simonlui - I received word from the engineering team that a public commit addressing the issue has been merged. https://github.com/intel/intel-extension-for-pytorch/commit/7ea2a3c8b756dce20e50a5ac5b64a7d5d28381d7 I have tested it on our internal branch and...
@simonlui - Okay great! Thanks for confirming. If you are satisfied with the fix, please close this issue. If you would like us to take a look at the 'deepspeed'...
@mudler - To start with, I would look into the drivers - specifically the UMD. I point to the UMD because the Intel-published Docker container is picking up devices well. Since...
@mudler - I am able to reproduce your issue in your conda environment. However, it runs fine in Python venv and Docker environments. I will discuss with the engineering team for...
@mudler - I got in touch with the engineering team and got a solution. The conda environment issue is known and documented here. > https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/performance_tuning/known_issues.html ``` Problem: Number of dpcpp...
@eduand-alvarez Hi, I have submitted an internal PR to fix the typo. However, we will not be able to remove the "#noqa F401" at this time - as they are needed...
Internal PR merged - updates will be refreshed soon.