RuntimeError: Unable to start Tika server.

Open mhrihab opened this issue 4 years ago • 1 comments

I created a function that parses a PDF file using Tika inside a service. When I tried to dockerize it, it failed with this error:

    parse_pdf(tmp_path)
  File "/app/process.py", line 90, in parse_pdf
    data = parser.from_file('document-page' + str(i) + '.pdf', headers=headers)
  File "/usr/local/lib/python3.8/site-packages/tika/parser.py", line 40, in from_file
    output = parse1(service, filename, serverEndpoint, headers=headers, config_path=config_path, requestOptions=requestOptions)
  File "/usr/local/lib/python3.8/site-packages/tika/tika.py", line 336, in parse1
    status, response = callServer('put', serverEndpoint, service, f,
  File "/usr/local/lib/python3.8/site-packages/tika/tika.py", line 531, in callServer
    serverEndpoint = checkTikaServer(scheme, serverHost, port, tikaServerJar, classpath, config_path)
  File "/usr/local/lib/python3.8/site-packages/tika/tika.py", line 601, in checkTikaServer
    raise RuntimeError("Unable to start Tika server.")
RuntimeError: Unable to start Tika server.

I couldn't fix this error. I am using tika==1.24, and my Dockerfile starts with FROM tiangolo/uvicorn-gunicorn-fastapi:python3.9

mhrihab, Aug 17 '21 16:08

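"To use this library, you need to have Java 7+ installed on your system as tika-python starts up the Tika REST server in the background." The base image does not ship Java, so you need to install it in the container. On a Debian-based image the package index usually has to be refreshed first: RUN apt-get update && apt-get install -y default-jdk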

Horasachy, Aug 23 '21 12:08

Correct, @Horasachy.

chrismattmann, Dec 31 '22 21:12
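
Since the root cause here is a missing Java runtime, a small pre-flight check can make the failure easier to diagnose than the generic "Unable to start Tika server." message. The following is only a sketch: `find_java` and `ensure_java` are hypothetical helper names, not part of tika-python, and the check only looks for a `java` executable on PATH (it does not verify the Java version).

```python
import shutil
from typing import Callable, Optional

# Hypothetical helpers (not part of tika-python): check for a Java runtime
# before calling tika.parser, so a missing JRE/JDK is reported clearly.


def find_java(which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the path of the `java` executable, or None if it is not on PATH."""
    return which("java")


def ensure_java(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return the path of `java`, or raise a descriptive error if it is missing."""
    java_path = find_java(which)
    if java_path is None:
        raise RuntimeError(
            "No `java` executable found on PATH. tika-python starts the Tika "
            "server with Java, so install a JRE/JDK in the image first "
            "(for example the default-jdk package on Debian-based images)."
        )
    return java_path
```

Calling `ensure_java()` once at service startup, before the first `parser.from_file(...)` call, would surface the problem at container start rather than on the first parsed document. The `which` parameter exists only so the lookup can be replaced in tests.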