leptonica
Question: Is it possible to replicate Tesseract's document page segmentation using leptonica APIs?
I am trying to do document page segmentation using leptonica. I got stuck after trying some basic page segmentation techniques. Then I came across this Tesseract paper: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35094.pdf
Is it possible to implement the above procedure using only leptonica?
Thanks.
It's a good approach, a bit more complicated than the leptonica functions in pageseg.c, but more accurate. The functions in pageseg.c are illustrative of the approach. You can also look at programs in the prog directory, such as livre_pageseg and anything starting with pageseg there, which do similar things.
And yes, it certainly is possible to implement Ray's page segmentation algorithm entirely in leptonica. It would take a bit of work, but the necessary data structures exist.
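For anyone following along: as far as I understand, the pageseg.c functions mentioned above (pixGetRegionsBinary, for example) rely largely on binary morphology, such as a horizontal closing that merges neighbouring characters into text-line blobs. Below is a minimal sketch of that single idea in plain C with no leptonica dependency. The helpers `dilate_h`, `erode_h`, `close_h` and `count_runs` are hypothetical names made up for illustration, and the one-byte-per-pixel layout is an assumption chosen for readability; leptonica itself works on packed 1 bpp PIX images.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration only; these are NOT leptonica
 * functions. Images are w x h, one byte per pixel, 1 = ink, 0 = paper. */

/* Horizontal dilation by a (2*half+1) x 1 brick.
 * Pixels outside the image are treated as paper (0). */
static void dilate_h(const unsigned char *src, unsigned char *dst,
                     int w, int h, int half)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            unsigned char v = 0;
            for (int dx = -half; dx <= half && !v; dx++) {
                int xx = x + dx;
                if (xx >= 0 && xx < w && src[y * w + xx])
                    v = 1;
            }
            dst[y * w + x] = v;
        }
    }
}

/* Horizontal erosion by the same brick.
 * Pixels outside the image are treated as ink (1), so that the
 * closing below never removes ink at the image border. */
static void erode_h(const unsigned char *src, unsigned char *dst,
                    int w, int h, int half)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            unsigned char v = 1;
            for (int dx = -half; dx <= half && v; dx++) {
                int xx = x + dx;
                if (xx >= 0 && xx < w && !src[y * w + xx])
                    v = 0;
            }
            dst[y * w + x] = v;
        }
    }
}

/* Horizontal closing = dilation followed by erosion. It fills gaps of
 * up to 2*half pixels between ink runs on the same row.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
static int close_h(const unsigned char *src, unsigned char *dst,
                   int w, int h, int half)
{
    assert(w > 0 && h > 0 && half >= 0);
    unsigned char *tmp = malloc((size_t)w * (size_t)h);
    if (!tmp)
        return -1;
    dilate_h(src, tmp, w, h, half);
    erode_h(tmp, dst, w, h, half);
    free(tmp);
    return 0;
}

/* Count the runs of consecutive ink pixels in a single row. */
static int count_runs(const unsigned char *row, int w)
{
    int runs = 0;
    for (int x = 0; x < w; x++) {
        if (row[x] && (x == 0 || !row[x - 1]))
            runs++;
    }
    return runs;
}
```

Calling `close_h` with `half = 1` on the single row `1 0 0 1 0 0 0 0 0 1` fills the two-pixel gap but leaves the five-pixel gap open, so `count_runs` drops from 3 to 2. A real page needs a much wider brick, tuned to the scan resolution, plus the extra steps in the paper (tab-stop detection, column finding), which this sketch does not attempt.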
Thanks for the reply. Yes, I have seen pageseg.c and the files under the prog directory. I am working on making it work with any document image in general. I will try Ray's method and get back to you if I run into any issues. 👍
@balachandarsv, the page segmentation of Tesseract has many weaknesses, especially with complex layouts. If you get a better segmentation using Leptonica it would be great to see your solution on GitHub.
@stweil Definitely. I will open-source the segmentation code once I have a decent version. In the meantime, I am also looking for existing page segmentation solutions, so that I can start learning from them and build on that. If you have any samples that you have tried or come across, please do share.