tabula-java
point tabula at table area
Along the lines of #151: can we help tabula find the area of the table to improve results? Maybe a combination of regex and some computer vision (i.e. detecting where on the page the long horizontal lines are) can do the trick?
I have a project where I have to find tables in hundreds of thousands of PDFs, and while my current approach (drawing bounding boxes directly on the PDF) gives good results, I need to consider fully automated approaches.
If no one has started this, I'd be happy to try my luck with some CV...
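
To make the "long horizontal lines" idea a bit more concrete, here is a rough sketch of what I have in mind. It is not tabula code: the class and method names (`HorizontalRuleFinder`, `findRuleRows`, `tableBand`) are made up for illustration, the thresholds are arbitrary guesses, and it assumes each page has already been rendered to a `BufferedImage` by some other tool. It just scans every pixel row for a long unbroken dark run and treats the first and last such rows as the top and bottom of a candidate table area.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of tabula-java.
public class HorizontalRuleFinder {

    /**
     * Returns the y coordinates of pixel rows whose longest unbroken run of
     * dark pixels covers at least minRunFraction of the page width.
     * darkThreshold is a 0-255 gray level; anything below it counts as dark.
     */
    public static List<Integer> findRuleRows(BufferedImage page, int darkThreshold,
                                             double minRunFraction) {
        List<Integer> rows = new ArrayList<>();
        int minRun = (int) Math.ceil(page.getWidth() * minRunFraction);
        for (int y = 0; y < page.getHeight(); y++) {
            int run = 0;
            int longest = 0;
            for (int x = 0; x < page.getWidth(); x++) {
                int rgb = page.getRGB(x, y);
                // Simple average of the three channels as a gray level.
                int gray = (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
                run = gray < darkThreshold ? run + 1 : 0;
                longest = Math.max(longest, run);
            }
            if (longest >= minRun) {
                rows.add(y);
            }
        }
        return rows;
    }

    /**
     * Vertical extent {top, bottom} spanned by the detected rules, or null
     * when fewer than two rule rows were found (assumption: a ruled table
     * has at least a top and a bottom line).
     */
    public static int[] tableBand(List<Integer> ruleRows) {
        if (ruleRows.size() < 2) {
            return null;
        }
        return new int[] { ruleRows.get(0), ruleRows.get(ruleRows.size() - 1) };
    }

    public static void main(String[] args) {
        // Synthetic 200x300 "page": three long rules plus one short stroke.
        BufferedImage page = new BufferedImage(200, 300, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = page.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, 200, 300);
        g.setColor(Color.BLACK);
        g.fillRect(20, 100, 160, 1);
        g.fillRect(20, 150, 160, 1);
        g.fillRect(20, 200, 160, 1);
        g.fillRect(20, 50, 20, 1); // too short to count as a table rule
        g.dispose();

        List<Integer> rows = findRuleRows(page, 128, 0.5);
        int[] band = tableBand(rows);
        System.out.println("rule rows: " + rows + ", band: " + band[0] + ".." + band[1]);
    }
}
```

The resulting band is in image pixels, so it would still need to be scaled back to page coordinates before it could serve as an extraction area. This only handles tables with ruled lines; tables without rules would need the regex side of the idea or something smarter.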