[Bug]: clEnqueueNDRangeKernel, error code: -54 when trying to run notebook 278 on GPU

Open clinty opened this issue 1 year ago • 10 comments

OpenVINO Version

2024.0.0

Operating System

Other (Please specify in description)

Device used for inference

CPU

Framework

PyTorch

Model used

278-stable-diffusion-ip-adapter

Issue description

After selecting GPU inference in the 278-stable-diffusion-ip-adapter notebook, the second generation image variation cell fails with

RuntimeError: Exception from src/inference/src/cpp/infer_request.cpp:223:
Exception from src/plugins/intel_gpu/src/runtime/ocl/ocl_stream.cpp:310:
[GPU] clEnqueueNDRangeKernel, error code: -54
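
For reference, the OpenCL headers define `-54` as `CL_INVALID_WORK_GROUP_SIZE`. The following is a minimal sketch for pulling the numeric code out of a message like the one above and naming it. The helper `decode_cl_error` and the partial lookup table are illustrative assumptions, not part of OpenVINO, and the symbolic name alone does not establish the root cause here.

```python
import re

# Partial table of OpenCL status codes as defined in CL/cl.h.
# Only a handful of entries are listed; extend as needed.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -52: "CL_INVALID_KERNEL_ARGS",
    -53: "CL_INVALID_WORK_DIMENSION",
    -54: "CL_INVALID_WORK_GROUP_SIZE",
    -55: "CL_INVALID_WORK_ITEM_SIZE",
}


def decode_cl_error(message: str) -> str:
    """Hypothetical helper: find 'error code: N' and return its symbolic name."""
    match = re.search(r"error code:\s*(-?\d+)", message)
    if match is None:
        return "no OpenCL error code found"
    code = int(match.group(1))
    return CL_ERROR_NAMES.get(code, f"unknown OpenCL error ({code})")


print(decode_cl_error("[GPU] clEnqueueNDRangeKernel, error code: -54"))
```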

Step-by-step reproduction

Run the cells of the 278-stable-diffusion-ip-adapter openvino notebook, select GPU, and continue to run cells.

Relevant log output

No response

Issue submission checklist

  • [X] I'm reporting an issue. It's not a question.
  • [X] I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • [X] There is reproducer code and related data files such as images, videos, models, etc.

clinty avatar Mar 13 '24 19:03 clinty

May I know which operating system you are using when running the Image Generation with Stable Diffusion and IP-Adapter notebook? Which Python version are you using on your machine? Does the error occur when running inference with the CPU plugin?

Wan-Intel avatar Mar 22 '24 05:03 Wan-Intel

@Wan-Intel , this is on Debian testing. Python is 3.11.8. I have been unable to run it on CPU as the process runs out of memory and gets killed by the OOM killer.

clinty avatar Mar 22 '24 13:03 clinty

I've validated the Image Generation with Stable Diffusion and IP-Adapter and the result is attached below: ok

May I know which CPU you are using to run the Image Generation with Stable Diffusion and IP-Adapter notebook?

For your information, the supported CPU processor can be checked at OpenVINO™ System Requirements, and the supported Operating System to run OpenVINO™ Notebooks can be checked at OpenVINO™ Notebook System Requirements.

Wan-Intel avatar Mar 25 '24 03:03 Wan-Intel

@Wan-Intel , I have now tried using Ubuntu 22.04 LTS (64 bit) on an i5-8365U and on an i7-10510U. On both machines the errors are the same as I get on Debian. With CPU, Python runs out of memory and dies. With GPU, I get clEnqueueNDRangeKernel, error code: -54.

clinty avatar Mar 26 '24 01:03 clinty

The Intel® Core™ i5-8365U and Intel® Core™ i7-10510U processors are both supported by OpenVINO™.

Could you please re-install the latest OpenVINO™ Notebook with the installation guide from here? Referring to this StackOverflow thread, please check and reduce local memory size, local group size, constant memory, and kernel arguments size.

Wan-Intel avatar Mar 26 '24 07:03 Wan-Intel

Hello @Wan-Intel , I have re-installed the latest OpenVINO™ Notebook. How do I reduce local memory size, local group size, constant memory, and kernel arguments size?

clinty avatar Mar 26 '24 18:03 clinty

I'm able to run the inference successfully when using the CPU plugin.

When I select the GPU plugin as the inference device and run the following lines:

generator = torch.Generator(device="cpu").manual_seed(576)

image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png")

result = ov_pipe(prompt='', ip_adapter_image=image, gaidance_scale=1, negative_prompt="", num_inference_steps=4, generator=generator)

fig = visualize_results([image, result.images[0]], ["input image", "result"])

I encountered the following error: RuntimeError: [GPU] Exceed max size of memory allocation: Required 68161536 bytes, already occupied: 7603245268 bytes, but available memory size is 7616372736 bytes.
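
The numbers in that message can be checked by hand: the device reports 7616372736 bytes in total, 7603245268 are already in use, so only about 13 MB remain for a 68 MB request. A minimal sketch of that arithmetic follows. The function name `allocation_headroom` is hypothetical, and the regular expression assumes the message wording shown above.

```python
import re

# Pattern matching the wording of the GPU plugin message quoted above.
PATTERN = re.compile(
    r"Required (\d+) bytes, already occupied: (\d+) bytes, "
    r"but available memory size is (\d+) bytes"
)


def allocation_headroom(message: str) -> dict:
    """Hypothetical helper: compute free bytes and the shortfall from the message."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("message does not match the expected format")
    required, occupied, available = (int(g) for g in match.groups())
    free = available - occupied        # bytes still unallocated on the device
    shortfall = required - free        # positive means the request cannot fit
    return {"required": required, "free": free, "shortfall": shortfall}


msg = (
    "[GPU] Exceed max size of memory allocation: Required 68161536 bytes, "
    "already occupied: 7603245268 bytes, but available memory size is "
    "7616372736 bytes."
)
report = allocation_headroom(msg)
print(report)  # free is 13127468 bytes, shortfall is 55034068 bytes
```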

Let me check with the relevant team and we'll update you as soon as possible.

Wan-Intel avatar Mar 27 '24 02:03 Wan-Intel

I also get the -54 error with the instant-id notebook.

clinty avatar Apr 11 '24 20:04 clinty

@clinty some of the demonstrated models can require at least 32GB RAM for conversion and running, as stated in the notebook's description. This applies to both notebooks you have tried (278-stable-diffusion-ip-adapter and 286-instant-id).

So the error may be caused by the process running out of memory. How much RAM does your system have? Make sure enough RAM is available; otherwise the process will be killed by the Linux kernel, which is expected behavior.
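
On Linux, total and currently available RAM can be read from `/proc/meminfo`. The sketch below is one way to compare them against the 32GB figure mentioned above. The helper names `parse_meminfo` and `ram_report` are hypothetical, and the threshold default is simply taken from this comment, not from an OpenVINO API.

```python
import os


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def ram_report(info: dict, required_gib: float = 32.0) -> str:
    """Hypothetical helper: summarize RAM against an assumed requirement."""
    total_gib = info["MemTotal"] / 1024**2  # kB -> GiB
    # MemAvailable is missing on very old kernels; fall back to MemFree.
    avail_kb = info.get("MemAvailable", info.get("MemFree", 0))
    avail_gib = avail_kb / 1024**2
    verdict = "OK" if avail_gib >= required_gib else "likely insufficient"
    return f"total {total_gib:.1f} GiB, available {avail_gib:.1f} GiB: {verdict}"


# Only read the real file where it exists (Linux).
if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print(ram_report(parse_meminfo(f.read())))
```

Note that a machine sold as "32GB" reports somewhat less than 32 GiB in `MemTotal`, and the available figure is lower still once the desktop and notebook server are running.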

avitial avatar May 07 '24 19:05 avitial

@avitial the system only has 32GB RAM. Could the error message be improved?

clinty avatar May 07 '24 20:05 clinty

Closing this, I hope previous responses were sufficient to help you proceed. Feel free to reopen and ask additional questions related to this topic.

avitial avatar Aug 09 '24 19:08 avitial