[Feature Request / Question] Use different OCR engine

Open • artt opened this issue 1 year ago • 1 comment

I find Apple's OCR (through ocrmac) much more reliable than PDFMiner, especially with Thai script. It already outputs the text along with its rectangular bounding boxes. Is it possible to specify a custom OCR engine, and if not, how hard would it be to add this feature? Thanks!


I mistakenly labeled this as a bug and have no way to change the label. Sorry.

artt, Mar 15 '24 17:03

Hey!

As noted in https://github.com/camelot-dev/camelot/issues/343, we are trying to build a maintained fork at pypdf_table_extraction.

The closest thing to an alternative OCR engine was https://github.com/camelot-dev/camelot/pull/209, but it was never finished.

Please open an issue or PR in the new repo if you'd like to discuss this further.

bosd, Aug 11 '24 19:08