tika-python
Tika server 2.9.1 PDF Tesseract OCR
Hello, I am a beginner and need your help. I use Tika server to extract metadata and text with the OCR strategy set to auto. On native PDF documents there is no problem, as the processing time is low, but on scanned PDF files (hundreds of pages) I hit the request timeout, whether through Python or curl. Is there a way to configure the tika-config.yml file so that OCR processes all the pages with strategy auto? Thanks in advance.
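For context, a request of the kind described above could look roughly like the sketch below. It is only an illustration under assumptions, not the exact code in use: it assumes the server listens on the default `http://localhost:9998/tika` endpoint, that the OCR strategy is passed per request through the `X-Tika-PDFOcrStrategy` header, and that the helper names `build_request` and `extract_text` and the 30-minute client timeout are placeholders.

```python
import urllib.request

# Assumed default tika-server endpoint; adjust to the actual host and port.
TIKA_URL = "http://localhost:9998/tika"


def build_request(pdf_bytes, ocr_strategy="auto", url=TIKA_URL):
    """Build a PUT request that asks Tika for plain text from a PDF.

    The X-Tika-PDFOcrStrategy header is assumed to select the PDF OCR
    strategy for this single request.
    """
    headers = {
        "Accept": "text/plain",
        "Content-Type": "application/pdf",
        "X-Tika-PDFOcrStrategy": ocr_strategy,
    }
    return urllib.request.Request(url, data=pdf_bytes, headers=headers, method="PUT")


def extract_text(pdf_path, timeout_s=1800):
    """Send the PDF to the server and wait up to timeout_s seconds (hypothetical value)."""
    with open(pdf_path, "rb") as f:
        req = build_request(f.read())
    # timeout_s only controls how long the client waits for the response.
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.read().decode("utf-8")
```

Raising the client timeout this way only changes how long Python waits. The server may also enforce its own per-task timeout, which would have to be raised separately in the server configuration.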