sbb_binarization

use predict_generator to better utilize GPU

Open bertsky opened this issue 4 years ago • 3 comments

When the model is applied in patch mode (the default), the patch windows are cropped one by one in a loop on the CPU (in Numpy), and each window is passed to model.predict() as a single image (on the GPU, in Keras).

https://github.com/qurator-spk/sbb_binarization/blob/8dd05064b2dbdc7d4bdfb8896251302e8ec5ecb3/sbb_binarize/sbb_binarize.py#L152

This does not utilize the GPU well, for two reasons (see the simplified sketch after this list):

  1. the effective batch size of 1 might be too small for the number of shader cores and the amount of GPU RAM
  2. the GPU kernel can only run briefly and then has to wait for the CPU each time (patch cropping and memory paging)
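
To illustrate the pattern (a simplified paraphrase of the loop linked above, not the actual sbb_binarize code; function and variable names are mine):

```python
import numpy as np

# Simplified sketch of the current per-patch pattern (illustrative only):
# every window is cropped on the CPU and then sent to the GPU as a batch of 1.
def binarize_patchwise(model, img, patch_size=448):
    h, w = img.shape[:2]
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(0, h - patch_size + 1, patch_size):
        for x in range(0, w - patch_size + 1, patch_size):
            patch = img[y:y + patch_size, x:x + patch_size]
            # the GPU processes a single image, then idles while the next crop is made
            pred = model.predict(patch[np.newaxis])
            out[y:y + patch_size, x:x + patch_size] = pred[0, ..., 0] > 0.5
    return out
```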

I suggest the following changes (sketched in code below):

  • Define a generator function that does the patching/cropping. It should be thread-safe, e.g. a keras.utils.Sequence.
  • Pass that to predict_generator instead of predict, so that CPU and GPU computation run concurrently.
  • Allow parameterizing the number of workers and the batch size, so they can be tuned to the concrete hardware and the crop/model sizes.
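
A minimal sketch of what I have in mind (the class name, patch_size, stride and batch_size are placeholders, not existing sbb_binarize code):

```python
import numpy as np
from tensorflow import keras

class PatchSequence(keras.utils.Sequence):
    """Thread-safe patch source: crops windows on the CPU in worker threads."""

    def __init__(self, img, patch_size=448, stride=448, batch_size=16):
        self.img = img
        self.patch_size = patch_size
        self.batch_size = batch_size
        h, w = img.shape[:2]
        # top-left coordinates of all patch windows
        self.coords = [(y, x)
                       for y in range(0, h - patch_size + 1, stride)
                       for x in range(0, w - patch_size + 1, stride)]

    def __len__(self):
        # number of batches
        return int(np.ceil(len(self.coords) / self.batch_size))

    def __getitem__(self, idx):
        # crop one whole batch of patches on the CPU
        batch = self.coords[idx * self.batch_size:(idx + 1) * self.batch_size]
        return np.stack([self.img[y:y + self.patch_size, x:x + self.patch_size]
                         for y, x in batch])

# usage sketch: worker threads prefetch/crop patches while the GPU predicts full batches
# seq = PatchSequence(img, patch_size=448, stride=448, batch_size=16)
# preds = model.predict_generator(seq, workers=4)
# (on TF >= 2.1, model.predict(seq, workers=4) accepts the Sequence directly)
```

The predictions would then have to be stitched back into the page image in the same window order, and the workers / batch size parameters exposed as options.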

bertsky avatar Jun 10 '21 22:06 bertsky

Spoiler: I know how to do this. Would you care for a PR?

bertsky avatar Jun 20 '21 15:06 bertsky

> Spoiler: I know how to do this. Would you care for a PR?

@bertsky I would appreciate it if you did that :)

vahidrezanezhad avatar Jun 21 '21 08:06 vahidrezanezhad

@bertsky did you ever complete this improvement? Maybe on a fork? I would like to run this binarization on a large dataset and with the current procedure it is simply too slow (10-20 images per minute).

apacha avatar Aug 24 '22 11:08 apacha