
Retrieving text inside layouts.

Open phuynhh opened this issue 3 years ago • 2 comments

Hi, thank you very much for your brilliant work. I have successfully installed and run the layout-parser package on Windows 10. However, as I come from a non-computing/data science background, I'm currently stuck on how to retrieve the text inside layouts and store it in a dataframe for further analysis. Would you be able to provide any keywords or links on how to do these tasks? Any pointers would be very much appreciated. Thanks a lot.

phuynhh • Apr 15 '21 14:04

https://github.com/Layout-Parser/layout-parser/blob/master/examples/Deep%20Layout%20Parsing.ipynb

lotfiabdelghafour • Apr 15 '21 18:04

Sure, no problem! And thanks @lotfiabdelghafour for the pointer. I assume you are dealing with image scans, so you might want to perform OCR after running the layout detection models. In this example, https://github.com/Layout-Parser/layout-parser/blob/master/examples/Deep%20Layout%20Parsing.ipynb, you can find some OCR examples at the very end. You could try that first and see if it's helpful.
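For readers who want the gist without opening the notebook, here is a minimal sketch of that OCR step. The helper name ocr_blocks and its padding parameter are made up for illustration; it only assumes that each block offers pad(), crop_image() and set(), and that the OCR agent offers detect(), as in the linked example.

```python
def ocr_blocks(image, blocks, ocr_agent, padding=5):
    """Run OCR on each detected block and store the result in block.text.

    ocr_blocks is a hypothetical helper, not part of layoutparser.
    """
    for block in blocks:
        # Pad the box a little so characters on the border are not cut off,
        # then crop that region out of the page image.
        segment = block.pad(
            left=padding, right=padding, top=padding, bottom=padding
        ).crop_image(image)
        # Recognize the text in the crop and attach it to the block.
        text = ocr_agent.detect(segment)
        block.set(text=text, inplace=True)
    return blocks


# Assumed usage, with layoutparser, pytesseract and Tesseract installed:
#   import layoutparser as lp
#   layout = model.detect(image)   # model: a layout detection model you loaded
#   ocr_blocks(image, layout, lp.TesseractAgent(languages="eng"))
```

After this runs, every block carries its recognized text, so you can loop over the layout and read block.text.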

Speaking of exporting, in the v0.2 release we've just added a function to export a layout to a dataframe: layout.to_dataframe(); see the details here.
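If you are on an older version, or want to control the columns yourself, a rough manual equivalent could look like the sketch below. FakeBlock and blocks_to_rows are hypothetical names used only for this demo; the function assumes each block exposes type, text and coordinates (x_1, y_1, x_2, y_2), and the column names are my own choice, not necessarily those produced by to_dataframe().

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FakeBlock:
    """Hypothetical stand-in for a detected block, used only in this demo."""
    type: str
    text: Optional[str]
    coordinates: Tuple[float, float, float, float]


def blocks_to_rows(blocks):
    """Turn blocks into a list of dicts, one row per block."""
    rows = []
    for block in blocks:
        x_1, y_1, x_2, y_2 = block.coordinates
        rows.append({
            "type": block.type,
            # Blocks that were never OCR'd may have text=None.
            "text": (block.text or "").strip(),
            "x_1": x_1, "y_1": y_1, "x_2": x_2, "y_2": y_2,
        })
    return rows


demo_blocks = [
    FakeBlock("Title", "Results\n", (10, 10, 200, 40)),
    FakeBlock("Text", "Some paragraph.", (10, 50, 200, 300)),
]
rows = blocks_to_rows(demo_blocks)
for row in rows:
    print(row["type"], "|", row["text"])
```

With pandas installed, pandas.DataFrame(rows) turns this list into a dataframe with one row per block.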

lolipopshock • Apr 16 '21 05:04