sbb_binarization
OCR-D processor is leaky
When processing a document of 1.5k pages of medium size (1-2 MP each), I observe a slow but steady increase in RSS from 4 GB to 14 GB after 1.2k pages, at which point the process is killed by the OS (Killed).
I do not see any Python bindings within reach of the input file loop that could accumulate this much data without ever being garbage-collected.
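To tell apart growth on the Python heap from growth in native allocations, a rough sketch like the following could be wrapped around the page loop. It only uses the standard library; `process_page` and `pages` are hypothetical stand-ins for the per-page call and the input file list, not actual sbb_binarization names. Note that `ru_maxrss` reports the peak RSS (in kilobytes on Linux), so it shows the high-water mark rather than the current value.

```python
import gc
import resource
import tracemalloc


def peak_rss_mib():
    # ru_maxrss is the peak resident set size of this process.
    # Assumption: Linux, where the unit is kilobytes (macOS reports bytes).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


def profile_loop(pages, process_page, every=100):
    """Call process_page on each page and sample memory every `every` pages.

    Returns a list of (page_count, peak_rss_mib, python_heap_mib) tuples.
    `process_page` is a hypothetical placeholder for the real per-page work.
    """
    tracemalloc.start()
    samples = []
    for i, page in enumerate(pages, start=1):
        process_page(page)
        if i % every == 0:
            # Force a collection so lingering cycles do not skew the reading.
            gc.collect()
            py_heap_bytes, _peak = tracemalloc.get_traced_memory()
            samples.append((i, peak_rss_mib(), py_heap_bytes / 2**20))
    tracemalloc.stop()
    return samples
```

If the Python heap figure stays flat while peak RSS keeps climbing, the growth is probably happening outside Python-managed objects (for example in a native library), which would match not finding anything in the loop's own bindings.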
I am on CUDA 11.8.
Has anybody seen this before?