camelot
[WIP] Add OCR support
Closes #14
Output on this image-based PDF (much better than vanilla tesseract):
```
0 1 2 3 4 5 6
0 u@ta nictance Percent Fu Savings el
1 Name (1lkm) (mi) Improved Speed Decreased Accel Eliminate Stops Decreased Idle
2 2012 2 3.30 1.3 5.9% 9.5% 29.2% 17.4%
3 2145 1 0.68 11.2 2.4% 0.1% 9.5% 2.7%
4 4234 1 0.59 58.7 8.5% 1.3% 8.5% 3.3%
5 2032 2 0.17 57.8 21.7% 0.3% 2.7% 1.2%
6 4171_1 0.07 173.9 58.1% 1.6% 2.1% 0.5%
```
Checklist:
- [x] Add LatticeOCR
- [x] Handle spanning cells
- [ ] Add StreamOCR
- [ ] Update docs
Is this PR applicable to non-searchable PDFs?
@rajasekharponakala What do you mean by MR?
@vinayak-mehta, oops, edited.