Improve table recognition

Open robbi5 opened this issue 9 years ago • 0 comments

Currently, table recognition is a simple check for some keywords in app/jobs/contains_table_job.rb.

We could improve this by looking for obvious table-like patterns such as:

November 2013 43.104 

Dezember 2013 30.419 

Januar 2014 29.218 

Februar 2014 15.598 

(from https://kleineanfragen.de/berlin/17/14442)
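A minimal sketch of such a pattern check, assuming rows shaped like the ones above (a label word, a four-digit year, then a number). The module name `TablePatternSketch`, the method `table_like?`, and the `min_rows` threshold are hypothetical and not part of the existing job:

```ruby
# Hypothetical helper, not part of kleineanfragen: flags text that contains
# a run of consecutive lines shaped like "<label> <year> <number>".
module TablePatternSketch
  # A label word, a four-digit year, then digits with optional "." or ","
  # separators (e.g. "43.104"). Trailing whitespace is tolerated.
  ROW = /\A\s*\p{L}+\s+\d{4}\s+\d+(?:[.,]\d+)*\s*\z/

  # min_rows is an assumed threshold: a single matching line is more likely
  # prose than a table, so several matching rows in a row are required.
  def self.table_like?(text, min_rows: 3)
    run = 0
    text.each_line do |line|
      # Extracted text often has blank lines between rows; skip them
      # without breaking the current run.
      next if line.strip.empty?

      if line.match?(ROW)
        run += 1
        return true if run >= min_rows
      else
        run = 0
      end
    end
    false
  end
end
```

A check like this could run alongside the existing keyword check rather than replace it. The row pattern only covers the month/year/value shape shown above, so other table layouts would need their own patterns.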

Additionally, we could use the table recognition from tabula: tabula-extractor / tabula-java

robbi5 • Dec 15 '15 17:12