
Could not run simple example on XPU

Open harborn opened this issue 1 year ago • 17 comments

Describe the issue

Hi, I have the following code:

import dpctl
import torch
import intel_extension_for_pytorch

xpu_num = len(dpctl.get_devices(backend="level_zero", device_type="gpu"))
print(f"xpu_num = {xpu_num}")

# device = torch.device("xpu:0")
device = torch.device("cpu:0")
print(f"device = {device}")
l = torch.nn.Linear(5, 5).to(device, torch.float)
print(f"type(l) = {type(l)}")
print(f"l = {l}")
i = torch.randn(5, 5, dtype=torch.float, device=device)
print(f"type(i) = {type(i)}")
print(f"i = {i}")
expected_out = l(i)
print(f"expected_out = {expected_out}")

When I enable device = torch.device("cpu:0") and run this script, the result is:

xpu_num = 6
device = cpu:0
type(l) = <class 'torch.nn.modules.linear.Linear'>
l = Linear(in_features=5, out_features=5, bias=True)
type(i) = <class 'torch.Tensor'>
i = tensor([[ 0.1940,  0.6313,  0.7539,  0.8270,  0.4829],
        [-0.6601, -0.1555, -0.5951,  0.0552, -0.8904],
        [-0.2962,  0.2588,  0.9915,  0.1151,  1.3780],
        [ 0.7729,  1.1673,  0.0569, -0.5209, -1.3297],
        [-1.8744,  0.5630,  0.2379,  1.7550, -0.5790]])
expected_out = tensor([[ 0.1335, -0.6444, -0.4030, -0.0073,  0.2138],
        [ 0.0510,  0.1734, -0.4978, -0.9275, -0.4016],
        [ 0.1532, -1.0520,  0.1228,  0.3162,  0.1983],
        [ 0.7138,  0.0412, -1.0125, -0.7827, -0.6498],
        [-0.8450, -0.9822, -1.1371, -0.7442,  0.7524]],
       grad_fn=<AddmmBackward0>)

The results show that the script runs OK on the CPU.

However, when I enable device = torch.device("xpu:0"), the tensor is created on the XPU but the forward pass fails:

xpu_num = 6
device = xpu:0
type(l) = <class 'torch.nn.modules.linear.Linear'>
l = Linear(in_features=5, out_features=5, bias=True)
type(i) = <class 'torch.Tensor'>
i = tensor([[ 0.0438,  0.4119,  0.4430, -0.9481, -0.0850],
        [-3.3893,  0.0693,  1.9792, -1.4143,  1.8235],
        [-0.2711,  0.6538, -1.3239,  1.4839, -0.5393],
        [ 1.4996, -0.5569, -0.0470,  1.7764,  0.5166],
        [ 1.7779,  0.4889,  0.6964,  2.2312,  0.2050]], device='xpu:0')
Traceback (most recent call last):
  File "/mnt/scratch/wugangsh/test_codes/python/test_ray.py", line 19, in <module>
    expected_out = l(i)
  File "/mnt/scratch/wugangsh/miniconda3/envs/ray_xpu/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/scratch/wugangsh/miniconda3/envs/ray_xpu/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: could not create an engine

My OS is:

> lsb_release -a
LSB Version:    n/a
Distributor ID: SUSE
Description:    SUSE Linux Enterprise Server 15 SP3
Release:        15.3
Codename:       n/a
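
For anyone trying to reproduce this, a small check like the following may help narrow down whether PyTorch itself can see the XPU devices, separately from what dpctl reports. This is only a sketch, not part of the original script: the helper name xpu_report is made up for illustration, and it assumes that torch.xpu exposes is_available() and device_count() once intel_extension_for_pytorch has been imported. It returns default values instead of raising when a package is missing.

```python
import importlib


def xpu_report():
    """Collect basic facts about XPU visibility (hypothetical helper).

    Never raises when torch or intel_extension_for_pytorch is missing;
    the corresponding fields simply keep their default values.
    """
    report = {
        "torch": None,           # torch version string, if importable
        "ipex": None,            # IPEX version string, if importable
        "xpu_available": False,  # result of torch.xpu.is_available()
        "xpu_count": 0,          # result of torch.xpu.device_count()
        "error": None,           # message from a failed XPU query, if any
    }

    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return report
    report["torch"] = str(torch.__version__)

    # Import IPEX before touching torch.xpu, as the original script does.
    try:
        ipex = importlib.import_module("intel_extension_for_pytorch")
        report["ipex"] = str(getattr(ipex, "__version__", "unknown"))
    except Exception:
        # IPEX missing or failing to load; leave report["ipex"] as None.
        pass

    # Assumption: torch.xpu provides is_available() and device_count().
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and hasattr(xpu, "is_available"):
        try:
            report["xpu_available"] = bool(xpu.is_available())
            if hasattr(xpu, "device_count"):
                report["xpu_count"] = int(xpu.device_count())
        except Exception as exc:
            report["error"] = str(exc)

    return report


if __name__ == "__main__":
    print(xpu_report())
```

If xpu_count here differs from the xpu_num = 6 printed by dpctl, that would suggest the problem lies in how the extension discovers devices rather than in the Linear layer itself.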

harborn, Jun 05 '23 09:06