median_sphere filter fails on large images

Open sebherbert opened this issue 2 years ago • 6 comments

Hi all,

@spherelife and I are having issues running the pyclesperanto_prototype.median_sphere filter. The same kernel size (6,6,2) works on a small image but not on a larger one. Error: RuntimeError: clEnqueueReadBuffer failed: OUT_OF_RESOURCES

We can run a smaller kernel on the large image, (2,2,2) for example, but not the (6,6,2) kernel. Surprisingly, after failing on the large image, the filter also stops working on the small image (same error) until we restart the kernel.

I was not expecting that the image size would play a large role in the processing but maybe I'm wrong and misunderstood something?
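
One rough way to see why both image size and kernel size can matter: a brute-force median filter visits every neighbour in the footprint for every voxel, so the work done in a single GPU kernel launch grows with the product of the two. The sketch below is only an estimate under assumptions not confirmed in this thread: that the three numbers are radii (a radius r spans offsets -r..r) and that the footprint is an ellipsoid. `footprint_voxels` and `relative_work` are hypothetical helper names, not part of pyclesperanto_prototype.

```python
from itertools import product
from math import prod


def footprint_voxels(rx, ry, rz):
    """Count voxels inside an axis-aligned ellipsoid with the given radii.

    Assumption: a radius r covers offsets -r..r along that axis.
    """
    count = 0
    axes = (range(-rx, rx + 1), range(-ry, ry + 1), range(-rz, rz + 1))
    for offsets in product(*axes):
        # Normalised squared distance; a zero radius only admits offset 0.
        d = sum((o / r) ** 2 if r else 0.0 for o, r in zip(offsets, (rx, ry, rz)))
        if d <= 1.0:
            count += 1
    return count


def relative_work(shape, radii):
    """Neighbour visits for one pass: image voxels times footprint voxels."""
    return prod(shape) * footprint_voxels(*radii)


# Image shapes and kernel sizes quoted in this thread.
for shape in ((378, 363, 77), (1536, 1536, 134)):
    for radii in ((2, 2, 2), (6, 6, 2)):
        print(shape, radii, relative_work(shape, radii))
```

The absolute numbers mean little; the point is that the large image with the large kernel is by far the most expensive of the four combinations, which is the one that fails.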

We are using a VM with a W10 machine and a shared NVIDIA RTXA6000-12Q.

I attach

  • the example datasets: https://drive.switch.ch/index.php/s/4Vmha5Ioy5OQtHY
  • the log, a script to recreate the error, the environment recipe as yml, and the command lines for setting up the environment manually: https://drive.switch.ch/index.php/s/7yBtjuxoOqg1rzZ

Let me know if something else could be of use for you.

Thanks!

sebherbert avatar Jul 28 '23 10:07 sebherbert

Hi @sebherbert ,

how large is the large image?

Best, Robert

haesleinhuepf avatar Jul 28 '23 14:07 haesleinhuepf

Hi @haesleinhuepf

The small image we tested is 378x363x77 in xyz and ~82 MB. The large image is 1536x1536x134 and ~2.4 GB.
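
As a quick sanity check on these numbers, the voxel counts and raw buffer sizes can be worked out directly from the dimensions. The sketch below assumes a single channel and tries a few bytes-per-voxel values, since the pixel type is not confirmed here; `buffer_bytes` is a hypothetical helper.

```python
from math import prod

# Dimensions as reported above (x, y, z).
SMALL = (378, 363, 77)
LARGE = (1536, 1536, 134)


def buffer_bytes(shape, bytes_per_voxel):
    """Raw size of one uncompressed buffer holding the whole image."""
    return prod(shape) * bytes_per_voxel


for name, shape in (("small", SMALL), ("large", LARGE)):
    for bpv in (2, 4, 8):  # e.g. 16-bit integer, 32-bit float, 64-bit float
        size_mb = buffer_bytes(shape, bpv) / 1e6
        print(f"{name}: {prod(shape)} voxels, {bpv} B/voxel -> {size_mb:.0f} MB")
```

The large image has about 30 times as many voxels as the small one (316,145,664 vs 10,565,478), and a filter needs at least an input and an output buffer on the GPU.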

Best, Fei

spherelife avatar Jul 31 '23 07:07 spherelife

Hi Robert,

Thanks for the fast follow-up!

As @spherelife was saying, the "large" image is 1536x1536x134 (16 bits if I recall correctly), so nothing completely crazy :) I guess we could try an intermediate image size, if that makes sense?

Best, Sebastien

sebherbert avatar Aug 04 '23 14:08 sebherbert

Hi @sebherbert and @spherelife ,

if you work on a Windows machine, can you try the solution proposed here and extend the kernel timeout in the registry?

Let me know if this helps!

Best, Robert

haesleinhuepf avatar Aug 26 '23 06:08 haesleinhuepf

Hi @haesleinhuepf,

Thanks for the reply, and sorry it took us a while to come back to you. In the meantime @spherelife has tested the same image on a more powerful workstation (A100 card, Linux based). He reported that it ran smoothly (he even tested a 6x6x6 kernel, which also passed without complaint despite being larger than 1000 voxels), so we'll keep this solution for the moment. So I guess this was either an allocation speed issue or a limited vRAM issue, since the larger card is working?

Thanks again for the support,

Best, Sebastien

sebherbert avatar Sep 08 '23 07:09 sebherbert

I just wanted to comment in case anyone else comes across this issue and is looking for assistance besides 'better GPU'. In fact, I came across this issue because processing some images was working on an RTX 3060 12GB, but not on a Quadro RTX6000 (24GB) workstation GPU. The workstation GPU was hitting a CL_INVALID_COMMAND_QUEUE error on slightly larger images, but was handling images 1/4 the size OK. Either way, the images are at least 20 times smaller than the VRAM, and all of them worked on the 3060.

Anyway, I added the following to the registry (no key existed previously):

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers]
"TdrDelay"=dword:0000003c
"TdrDdiDelay"=dword:0000003c

in PowerShell with

New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\GraphicsDrivers" -Name TdrDelay -PropertyType DWord -Value 60 -Force
New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\GraphicsDrivers" -Name TdrDdiDelay -PropertyType DWord -Value 60 -Force

and am now getting no errors on the workstation GPU. In the past I have had other workflows with much larger images that failed with CL_MEM_OBJECT_ALLOCATION_FAILURE, and I will report back if this helps there too.
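
For anyone comparing the two snippets above: the .reg syntax stores the value as hexadecimal while the PowerShell command takes decimal, and both set the same delay. TdrDelay is measured in seconds. A minimal check, with `parse_reg_dword` as a hypothetical helper name:

```python
def parse_reg_dword(text):
    """Convert a .reg style 'dword:XXXXXXXX' string to an integer."""
    prefix = "dword:"
    if not text.lower().startswith(prefix):
        raise ValueError(f"not a dword value: {text!r}")
    # The digits after the prefix are hexadecimal.
    return int(text[len(prefix):], 16)


tdr_delay_seconds = parse_reg_dword("dword:0000003c")
print(tdr_delay_seconds)  # 60, matching -Value 60 in the PowerShell command
```

As far as I know, the Windows default for TdrDelay is 2 seconds, so a single GPU kernel launch that runs longer than that can trigger a driver reset; 60 gives long-running filters much more headroom.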

TimMonko avatar Aug 29 '24 17:08 TimMonko