
make more efficient

Open bertsky opened this issue 3 years ago • 1 comment

We currently only use Detectron2's DefaultPredictor for inference: https://github.com/bertsky/ocrd_detectron2/blob/0272d95a930d5136bba29e530a3530c13ab17166/ocrd_detectron2/segment.py#L126

But the documentation says:

This is meant for simple demo purposes, so it does the above steps automatically. This is not meant for benchmarks or running complicated inference logic. If you’d like to do anything more complicated, please refer to its source code as examples to build and use the model manually.

One can clearly see that GPU utilization is low, so a multi-threaded implementation with data pipelining should boost performance considerably.
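For illustration, here is a minimal sketch (not the actual implementation) of what "building and using the model manually" could look like, so that several page images go through one batched forward pass instead of one DefaultPredictor call per page. It assumes the same `cfg` that segment.py already constructs; `predict_batch` is an illustrative name:

```python
# Minimal sketch (not the actual implementation): build the model manually
# instead of using DefaultPredictor, so several page images can be batched
# into a single forward pass. Assumes `cfg` is set up as in segment.py.
import torch
from detectron2.modeling import build_model
from detectron2.checkpoint import DetectionCheckpointer
import detectron2.data.transforms as T

model = build_model(cfg)                      # placed on cfg.MODEL.DEVICE
model.eval()
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)

aug = T.ResizeShortestEdge(
    [cfg.INPUT.MIN_SIZE_TEST, cfg.INPUT.MIN_SIZE_TEST], cfg.INPUT.MAX_SIZE_TEST)

def predict_batch(images):
    """Run inference on a list of HWC numpy arrays (in cfg.INPUT.FORMAT channel order)."""
    inputs = []
    for image in images:
        height, width = image.shape[:2]
        image = aug.get_transform(image).apply_image(image)
        image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
        inputs.append({"image": image, "height": height, "width": width})
    with torch.no_grad():
        return model(inputs)                  # one dict with "instances" per image
```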

bertsky commented on Jan 21 '22

The first attempt in predict-async does not actually reduce wall time (it only reduces CPU seconds a bit). Perhaps we first need to disentangle the page loop (i.e. turn it into a pipeline), as sketched below.
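A minimal sketch (hypothetical names, not the current code) of such a disentangled page loop, overlapping CPU-side page loading with GPU inference via a bounded queue; `load_page_image` and `predict_batch` stand in for the workspace I/O and the batched predictor sketched above:

```python
# Minimal sketch: feed a bounded queue from a background thread so page
# loading/decoding (CPU/IO) overlaps with model inference (GPU).
import queue
import threading

def pipelined_inference(page_ids, load_page_image, predict_batch, batch_size=4):
    q = queue.Queue(maxsize=2 * batch_size)   # bounded, to limit memory use

    def producer():
        for page_id in page_ids:
            q.put((page_id, load_page_image(page_id)))   # CPU/IO-bound work
        q.put(None)                                      # sentinel: no more pages

    threading.Thread(target=producer, daemon=True).start()

    ids, batch = [], []
    while True:
        item = q.get()
        if item is None:
            break
        page_id, image = item
        ids.append(page_id)
        batch.append(image)
        if len(batch) == batch_size:
            yield from zip(ids, predict_batch(batch))    # GPU-bound work
            ids, batch = [], []
    if batch:
        yield from zip(ids, predict_batch(batch))
```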

However, https://github.com/bertsky/ocrd_detectron2/commit/88617a25d3f847d65e8260391b27fda45ae55987 (i.e. predicting and post-processing at a lower pixel density, no more than 150 DPI) already helps quite a bit.
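For reference, a minimal sketch of that downscaling idea, assuming a PIL page image with known DPI; `MAX_DPI` and `downscale_for_prediction` are illustrative names, not the actual implementation in that commit:

```python
# Minimal sketch: downscale the page image to at most 150 DPI before
# prediction, then map the predicted coordinates back to full resolution.
from PIL import Image

MAX_DPI = 150

def downscale_for_prediction(page_image, dpi):
    """Return (possibly downscaled image, factor to scale predictions back up)."""
    if dpi <= MAX_DPI:
        return page_image, 1.0
    zoom = MAX_DPI / dpi
    size = (int(page_image.width * zoom), int(page_image.height * zoom))
    return page_image.resize(size, Image.BICUBIC), 1.0 / zoom

# after prediction: multiply polygon/box coordinates by the returned factor
# to get coordinates in the full-resolution page image again
```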

bertsky commented on Feb 03 '22