Resume-Rater

Trying to test run and train

Open • sourcelead opened this issue 5 years ago • 1 comment

I get the following error when testing and training with PDF files:

python3 main.py --type fixed "./src/data/test/Dong Xing_Catherine Zhang_Equity Research Intern.pdf" --model_name model
Loading nlp tools...
Loading pdf parser...
2019-06-13 12:32:38,162 [MainThread ] [WARNI] Tika server returned status: 500
Traceback (most recent call last):
  File "main.py", line 101, in <module>
    r.test(path_to_resume, infoExtractor)
  File "/media/Shared/resume_Rat/Resume-Rater-master/src/model.py", line 568, in test
    doc, _ = loadDocumentIntoSpacy(filename, self.parser, self.nlp)
  File "/media/Shared/resume_Rat/Resume-Rater-master/src/utils.py", line 162, in loadDocumentIntoSpacy
    new_text = getPDFText(f, parser)
  File "/media/Shared/resume_Rat/Resume-Rater-master/src/utils.py", line 144, in getPDFText
    raw = parser.from_file(filename)
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/tika/parser.py", line 40, in from_file
    return _parse(jsonOutput)
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/tika/parser.py", line 77, in _parse
    realJson = json.loads(jsonOutput[1])
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

sourcelead, Jun 13 '19 12:06

I think it's a Tika parser problem. I did not want to use Tika because it means interfacing with Java, but sadly the other methods require a lot of dependencies. You can try restarting your Tika server, or maybe upgrading your Python to 3.7.

Also, Tika (unfortunately) requires an Internet connection, so it is possible that you were not connected to Apache Tika.

ongteckwu, Jun 13 '19 13:06
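
If restarting the Tika server helps only some of the time, one option is to retry the parse call instead of letting the JSONDecodeError from the traceback above end the run. The following is a minimal sketch, not code from Resume-Rater: parse_with_retry is a hypothetical helper, and the check on a "status" key assumes a tika-python version that includes one in its result (the check is skipped when the key is absent).

```python
import json
import time


def parse_with_retry(parse, filename, retries=3, delay=1.0):
    """Call parse(filename), retrying when the response is unusable.

    `parse` is any callable shaped like tika's parser.from_file.
    Hypothetical helper; it is not part of Resume-Rater or tika-python.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            result = parse(filename)
            # Assumption: newer tika-python results carry a "status" key;
            # if the key is missing, the result is accepted as-is.
            if isinstance(result, dict) and result.get("status", 200) == 200:
                return result
            last_error = RuntimeError(f"unexpected parser result: {result!r}")
        except json.JSONDecodeError as err:
            # Raised when the server answers with a non-JSON body (e.g. HTTP 500).
            last_error = err
        if attempt < retries:
            time.sleep(delay)  # give the server time to recover
    raise RuntimeError(
        f"could not parse {filename!r} after {retries} attempts"
    ) from last_error
```

With tika-python installed, this could be called as parse_with_retry(parser.from_file, path_to_pdf) after `from tika import parser`. Retrying only hides transient failures; if every attempt fails, the chained exception still shows the original error.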