sbb_binarization
use predict_generator to better utilize GPU
When the model is applied in patch mode (the default), a loop over the windows runs on the CPU (in Numpy), and each window is passed to model.predict() as a single image (on GPU / in Keras).
https://github.com/qurator-spk/sbb_binarization/blob/8dd05064b2dbdc7d4bdfb8896251302e8ec5ecb3/sbb_binarize/sbb_binarize.py#L152
This does not utilize the GPU well, for two reasons:
- the effective batch size of 1 might be too low for the number of shader cores and the size of GPU RAM
- the GPU kernel can only run briefly and has to wait for the CPU each time (patch cropping and memory paging)
I suggest changing the following (see the sketch below this list):
- Define a generator function doing the patching/cropping. It should be a thread-safe formulation, e.g. a `keras.utils.Sequence`.
- Pass that to `predict_generator` instead of `predict` to get concurrent CPU / GPU computation.
- Allow parameterizing the number of workers and the batch size, to allow optimal adaptation to the concrete hardware and crop/model sizes.
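A minimal sketch of what that could look like. Names such as `PatchSequence`, `img`, `model_height`, `model_width` are hypothetical, and the cropping shown is non-overlapping for brevity; the actual windowing in `sbb_binarize.py` (overlap/margins, border handling) would go into `__getitem__`:

```python
import numpy as np
from keras.utils import Sequence

class PatchSequence(Sequence):
    """Thread-safe batch generator that crops an image into model-sized patches."""

    def __init__(self, img, patch_h, patch_w, batch_size=8):
        self.img = img
        self.patch_h = patch_h
        self.patch_w = patch_w
        self.batch_size = batch_size
        # precompute top-left corners of all patches (non-overlapping here;
        # the real code would use overlapping windows with margins)
        self.coords = [(y, x)
                       for y in range(0, img.shape[0] - patch_h + 1, patch_h)
                       for x in range(0, img.shape[1] - patch_w + 1, patch_w)]

    def __len__(self):
        # number of batches
        return int(np.ceil(len(self.coords) / self.batch_size))

    def __getitem__(self, idx):
        # crop one batch of patches on the CPU while the GPU works on the previous one
        batch = self.coords[idx * self.batch_size:(idx + 1) * self.batch_size]
        patches = [self.img[y:y + self.patch_h, x:x + self.patch_w]
                   for y, x in batch]
        return np.stack(patches)

# usage: concurrent CPU cropping and GPU inference
# seq = PatchSequence(img, model_height, model_width, batch_size=16)
# preds = model.predict_generator(seq, workers=4, use_multiprocessing=False)
```

The `workers` and `batch_size` arguments are exactly the knobs that should be exposed to the user for tuning to the concrete hardware.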
Spoiler: I know how to do this. Would you care for a PR?
@bertsky I'd appreciate it if you did that :)
@bertsky did you ever complete this improvement? Maybe on a fork? I would like to run this binarization on a large dataset, and with the current procedure it is simply too slow (10-20 images per minute).