auto_split_upscale doesn't work with CPU inference

Open Splendide-Imaginarius opened this issue 2 years ago • 4 comments

This code is pretty clearly assuming CUDA:

https://github.com/joeyballentine/ESRGAN/blob/b13baabcdac1ab098ae91debce7e1ce85bc48f7c/utils/dataops.py#L44-L55

When doing CPU inference (at least on most Linux systems), running out of memory won't raise a CUDA exception; it will just get a process killed (probably Python, but maybe some random other process on the system). On Windows, it's even worse: the entire system is likely to lock up and require a hard restart.
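To make the dependency on an exception concrete, here is a rough sketch of the catch-and-split pattern that this kind of auto-splitting relies on. It is not the linked code: `fake_upscale`, `split_on_oom`, and `MAX_PIXELS` are made-up names, the "image" is a plain list of rows, the out-of-memory condition is simulated, and tile overlap is left out. The point is only that the split branch runs when a `RuntimeError` arrives, which is what never happens when the OS kills the process instead.

```python
# Hypothetical stand-in for a GPU upscaler: above a pixel budget it raises
# RuntimeError, the exception type PyTorch uses to report CUDA OOM.
MAX_PIXELS = 16


def fake_upscale(img, scale=2):
    h, w = len(img), len(img[0])
    if h * w > MAX_PIXELS:
        raise RuntimeError("CUDA out of memory (simulated)")
    # Nearest-neighbour upscale: repeat every pixel `scale` times in x and y.
    out = []
    for row in img:
        wide = [p for p in row for _ in range(scale)]
        out.extend([list(wide) for _ in range(scale)])
    return out


def split_on_oom(img, upscale):
    """Try the whole image; on a (simulated) OOM, recurse into four quadrants."""
    try:
        return upscale(img)
    except RuntimeError as e:
        # Only swallow out-of-memory errors; anything else is a real bug.
        if "out of memory" not in str(e):
            raise
    h, w = len(img), len(img[0])
    if h < 2 or w < 2:
        raise RuntimeError("image cannot be split any further")
    mh, mw = h // 2, w // 2
    top_left = split_on_oom([r[:mw] for r in img[:mh]], upscale)
    top_right = split_on_oom([r[mw:] for r in img[:mh]], upscale)
    bottom_left = split_on_oom([r[:mw] for r in img[mh:]], upscale)
    bottom_right = split_on_oom([r[mw:] for r in img[mh:]], upscale)
    # Stitch the quadrants back together row by row.
    top = [a + b for a, b in zip(top_left, top_right)]
    bottom = [a + b for a, b in zip(bottom_left, bottom_right)]
    return top + bottom
```

With an 8x8 input, the first attempt fails, each 4x4 quadrant succeeds, and the stitched result is 16x16. On CPU there is no exception to catch, so the `except` branch would simply never run.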

I see two possible approaches to fix it:

  1. Allow the user to choose to explicitly provide a tile size, as upstream Real-ESRGAN's inference code does.
  2. Use some heuristic to detect high (but not critical) RAM usage, e.g. checking the RAM size of the machine and comparing it to system RAM+swap usage, and ramp up the tile size until the RAM usage gets too high.

Option 2 seems very messy to me, and I suspect it will not yield optimal results.
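For option 1, a minimal sketch of what a user-chosen tile size could drive is below. `tile_boxes` and its parameters are hypothetical names, not anything from this repo or from Real-ESRGAN; it only computes which regions to cut out, and the actual cropping, inference, and blending of overlaps would still have to be written around it.

```python
def _starts(length, tile_size, step):
    """Start offsets along one axis so that the tiles cover [0, length)."""
    starts = []
    pos = 0
    while True:
        starts.append(pos)
        if pos + tile_size >= length:
            break
        pos += step
    return starts


def tile_boxes(width, height, tile_size, overlap=0):
    """Return (left, top, right, bottom) boxes covering a width x height image.

    Boxes are at most tile_size on a side; neighbours share `overlap` pixels.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must be in [0, tile_size)")
    step = tile_size - overlap
    boxes = []
    for top in _starts(height, tile_size, step):
        for left in _starts(width, tile_size, step):
            right = min(left + tile_size, width)
            bottom = min(top + tile_size, height)
            boxes.append((left, top, right, bottom))
    return boxes
```

For example, `tile_boxes(10, 7, 4, overlap=1)` gives six boxes, none larger than 4x4, so the peak input size per inference call is fixed by the user rather than discovered by crashing.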

Splendide-Imaginarius avatar Feb 02 '23 04:02 Splendide-Imaginarius

Tiling on CPU is useless, as it is all done in memory. An OOM on CPU means you don't have any free RAM left, and tiling at that point would just use more RAM, if anything.

The only way around that would be using a temp-file cache for CPU mode. So basically, tile by saving the pieces to disk and then patching them together afterwards.
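A rough sketch of that temp-file idea, purely for illustration: `upscale_via_disk` and `nearest_upscale` are hypothetical names, the image is a plain list of rows, pickle stands in for whatever on-disk format would really be used, and there is no overlap or blending. Note that the final canvas here still lives in RAM, so this only keeps the intermediate tile results off the heap.

```python
import os
import pickle
import tempfile


def nearest_upscale(img, scale=2):
    """Stand-in for the real model: repeat each pixel `scale` times in x and y."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(scale)]
        out.extend([list(wide) for _ in range(scale)])
    return out


def upscale_via_disk(img, tile, upscale, scale):
    """Upscale tile by tile, parking each result on disk until stitching.

    `scale` must match the factor that `upscale` actually applies.
    """
    h, w = len(img), len(img[0])
    with tempfile.TemporaryDirectory() as tmp:
        paths = {}
        for top in range(0, h, tile):
            for left in range(0, w, tile):
                patch = [row[left:left + tile] for row in img[top:top + tile]]
                path = os.path.join(tmp, f"{top}_{left}.pkl")
                with open(path, "wb") as f:
                    pickle.dump(upscale(patch), f)
                paths[(top, left)] = path
        # Stitch: load one tile at a time and copy it into the output canvas.
        out = [[None] * (w * scale) for _ in range(h * scale)]
        for (top, left), path in paths.items():
            with open(path, "rb") as f:
                result = pickle.load(f)
            for dy, row in enumerate(result):
                x0 = left * scale
                out[top * scale + dy][x0:x0 + len(row)] = row
        return out
```

Usage would look like `upscale_via_disk(img, 4, nearest_upscale, 2)`; the temporary directory is removed automatically once stitching is done.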

joeyballentine avatar Mar 22 '23 03:03 joeyballentine

Um. Are you sure about that? I've run into OOM issues on the upstream Real-ESRGAN repo when doing CPU inference with no tiling, and they were fixed by enabling tiling. The issue isn't that the images themselves are using up too much RAM, it's that the inference model is using too much RAM, which is reduced if the inference model deals with a smaller image. (I don't discount the chance that I'm confused about how this works, but this sounds wrong to me.)
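As a back-of-the-envelope illustration of that argument (not a measurement): the size of a single intermediate feature map scales with the spatial size of the input. The 64 channels and float32 values below are assumptions chosen as plausible for an ESRGAN-style network, and `feature_map_bytes` is a made-up helper; real peak usage depends on the architecture and on how many activations are alive at once.

```python
def feature_map_bytes(width, height, channels=64, bytes_per_value=4):
    """Bytes for one dense feature map at the given spatial size.

    channels=64 and bytes_per_value=4 (float32) are assumptions, not values
    read from any particular model.
    """
    return width * height * channels * bytes_per_value


full_image = feature_map_bytes(4000, 3000)  # 3_072_000_000 bytes
one_tile = feature_map_bytes(512, 512)      # 67_108_864 bytes

print(f"full 4000x3000 input: {full_image / 2**30:.2f} GiB per feature map")
print(f"one 512x512 tile:     {one_tile / 2**20:.0f} MiB per feature map")
```

Under those assumptions a single feature map for the full image is already several GiB, while the same layer on a tile needs tens of MiB, which would be consistent with tiling helping on CPU as well.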

Splendide-Imaginarius avatar Mar 23 '23 05:03 Splendide-Imaginarius

> I've run into OOM issues on the upstream Real-ESRGAN repo when doing CPU inference with no tiling, and they were fixed by enabling tiling.

I would have to look at how they're doing tiling. It's definitely different from how we do it here if it's able to fix non-VRAM OOMs.

joeyballentine avatar Mar 23 '23 05:03 joeyballentine

> I would have to look at how they're doing tiling. It's definitely different from how we do it here if it's able to fix non-vram OOMs

Interesting. I'll see if I can investigate what might be different (though by all means feel free to look at it on your end too; you clearly have a better knowledge of the code involved).

Splendide-Imaginarius avatar Mar 23 '23 05:03 Splendide-Imaginarius