[WIP] Add OCR support

Open vinayak-mehta opened this issue 5 years ago • 3 comments

Closes #14

Output on this image-based PDF (much better than vanilla tesseract):

        0       1         2               3                4                5               6
0    u@ta          nictance                       Percent Fu       Savings el                
1    Name  (1lkm)      (mi)  Improved Speed  Decreased Accel  Eliminate Stops  Decreased Idle
2  2012 2    3.30       1.3            5.9%             9.5%            29.2%           17.4%
3  2145 1    0.68      11.2            2.4%             0.1%             9.5%            2.7%
4  4234 1    0.59      58.7            8.5%             1.3%             8.5%            3.3%
5  2032 2    0.17      57.8           21.7%             0.3%             2.7%            1.2%
6  4171_1    0.07     173.9           58.1%             1.6%             2.1%            0.5%

Checklist:

  • [x] Add LatticeOCR
  • [x] Handle spanning cells
  • [ ] Add StreamOCR
  • [ ] Update docs

vinayak-mehta • Nov 08 '20 23:11

Is this PR applicable to non-searchable PDFs?

rajasekharponakala • Mar 22 '21 13:03

@rajasekharponakala What do you mean by MR?

vinayak-mehta • Jun 27 '21 19:06

@vinayak-mehta, oops, edited.

rajasekharponakala • Jun 28 '21 01:06