This might fix the word that disappears from the upper line in issue 3871:
https://github.com/tesseract-ocr/tesseract/issues/3871
The fix does not work when running with -c edges_use_new_outline_complexity=1.
It might be that this 'fix' revives the 'rejected parent' from the bucket, while other rejected blobs, which should probably still be available, are killed along with the empty word. I think the children are mistrusted, since those blobs also come by while the row is being made, including the dot on the i. Since the newer layers of tesseract even know how to revive a parent, I guess they could cope better with those mistrusted children as well. I'm trying to get to the right point in the debugger...
I changed the strategy from keeping the parent alive to killing the parent as soon as it is known. That solves the issues on both lines. I wonder whether other examples can reveal a motivation for reviving the parent to a lesser extent.