VIDEO_SCHEDULER_INTERNAL_ERROR
Describe the bug
After following the steps in the blog post that came out a few days ago, I modified the code to try to interact with the google/gemma-7b model in a Jupyter notebook. Code as follows:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
############# code changes ###############
# import ipex
import intel_extension_for_pytorch as ipex
# verify Intel Arc GPU
print(ipex.xpu.get_device_name(0))
##########################################
# load model
model_id = "google/gemma-7b"
dtype = torch.float16
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
############# code changes ###############
# move to Intel Arc GPU
model = model.eval().to("xpu")
##########################################
# generate
with torch.inference_mode(), torch.no_grad(), torch.autocast(
    ############# code changes ###############
    device_type="xpu",
    ##########################################
    enabled=True,
    dtype=dtype,
):
    text = "You may have heard of Schrodinger cat mentioned in a thought experiment in quantum physics. Briefly, according to the Copenhagen interpretation of quantum mechanics, the cat in a sealed box is simultaneously alive and dead until we open the box and observe the cat. The macrostate of cat (either alive or dead) is determined at the moment we observe the cat."
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    ############# code changes ###############
    # move to Intel Arc GPU
    input_ids = input_ids.to("xpu")
    ##########################################
    generated_ids = model.generate(input_ids, max_new_tokens=128)[0]
    generated_text = tokenizer.decode(generated_ids, skip_special_tokens=True)
    print(generated_text)
The code started executing successfully, but after the model weights were transferred to the GPU my computer started showing graphical artifacts (Chrome windows blanking and resizing), and shortly after that the computer crashed to a VIDEO_SCHEDULER_INTERNAL_ERROR BSOD. The event log reads:
The computer has rebooted from a bugcheck. The bugcheck was: 0x00000119 (0x0000000000000005, 0xffffe30e54c27000, 0xffffe30e5468a030, 0x0000000000050ec1). A dump was saved in: C:\WINDOWS\MEMORY.DMP. Report Id: 96e4c860-9dc4-49b8-a14f-e02f85d20f5e.
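For reference, this is the variant I plan to try next. My guess (not confirmed) is that loading gemma-7b in full fp32 and then shipping the weights to the 16 GB A770 puts heavy pressure on the driver, so this sketch loads the checkpoint directly in float16 via torch_dtype and prints the allocated device memory before generating. Note that torch_dtype=dtype is my assumption about what the blog intended, and torch.xpu.memory_allocated() is, as far as I can tell, the IPEX counterpart of the CUDA memory API; treat both as assumptions on my part rather than a confirmed fix.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import intel_extension_for_pytorch as ipex

model_id = "google/gemma-7b"
dtype = torch.float16

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Assumption: load the checkpoint directly in float16 so a full fp32 copy of
# the ~7B parameters never has to be materialized and moved to the GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
model = model.eval().to("xpu")

# Assumption: torch.xpu.memory_allocated() mirrors torch.cuda.memory_allocated()
# in the IPEX XPU build; this just reports how much device memory the weights occupy.
print(f"allocated after load: {torch.xpu.memory_allocated(0) / 1024**3:.1f} GiB")

# quick smoke test with a short prompt before retrying the full repro
with torch.inference_mode(), torch.autocast(device_type="xpu", enabled=True, dtype=dtype):
    input_ids = tokenizer("Hello", return_tensors="pt").input_ids.to("xpu")
    generated_ids = model.generate(input_ids, max_new_tokens=32)[0]
    print(tokenizer.decode(generated_ids, skip_special_tokens=True))

If the float16 load still triggers the same bugcheck, that would at least rule out plain memory pressure as the cause.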
Versions
PyTorch version: 2.1.0a0+cxx11.abi
PyTorch CXX11 ABI: No
IPEX version: 2.1.10+xpu
IPEX commit: a12f9f650
Build type: Release

OS: Microsoft Windows 11 Pro
GCC version: N/A
Clang version: N/A
IGC version: 2024.0.2 (2024.0.2.20231213)
CMake version: version 3.28.0-msvc1
Libc version: N/A

Python version: 3.9.18 (main, Sep 11 2023, 14:09:26) [MSC v.1916 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22631-SP0
Is XPU available: True
DPCPP runtime version: N/A
MKL version: N/A
GPU models and configuration: [0] _DeviceProperties(name='Intel(R) Arc(TM) A770 Graphics', platform_name='Intel(R) Level-Zero', dev_type='gpu, support_fp64=0, total_memory=15930MB, max_compute_units=512, gpu_eu_count=512)
Intel OpenCL ICD version: N/A
Level Zero version: N/A

CPU:
Architecture=9
CurrentClockSpeed=3600
DeviceID=CPU0
Family=107
L2CacheSize=4096
L2CacheSpeed=
Manufacturer=AuthenticAMD
MaxClockSpeed=3600
Name=AMD Ryzen 7 3700X 8-Core Processor
ProcessorType=3
Revision=28928

Versions of relevant libraries:
[pip3] intel-extension-for-pytorch==2.1.10+xpu
[pip3] numpy==1.26.4
[pip3] torch==2.1.0a0+cxx11.abi
[conda] intel-extension-for-pytorch 2.1.10+xpu pypi_0 pypi
[conda] numpy 1.26.4 pypi_0 pypi
[conda] torch 2.1.0a0+cxx11.abi pypi_0 pypi