grobid_client_python
grobid-client.py repeatedly fails after several files, but on different files each time
I am running Grobid Server 0.5.6 from the command line on a practice folder with 127 PDFs (as a practice run before running on 80,000+ PDFs). After a few PDFs (each attempt on the same folder returns a different number of TEI files), it fails and gives me the following error:
Task :grobid-service:run FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':grobid-service:run'.
> Process 'command 'C:\Program Files\Java\jdk1.8.0_231\bin\java.exe'' finished with non-zero exit value 1
Deprecated Gradle features were used in this build, making it incompatible with Gradle 6.0.
Use '--warning-mode all' to show the individual deprecation warnings.
See https://docs.gradle.org/5.4.1/userguide/command_line_interface.html#sec:command_line_warnings
BUILD FAILED in 1m 20s
6 actionable tasks: 1 executed, 5 up-to-date
I can restart the server and run the program again; it works for a few files and then fails again. Following https://github.com/kermitt2/grobid-client-python/issues/2, I switched to the GROBID public demo server and it seemed to work. However, not all the TEI files were created in my output folder: I had to run the program 4 consecutive times to get all 127 files. Each run correctly skipped the files that had already been converted. Since it runs and I am eventually able to convert all the files in the folder, I'm not sure where I am going wrong. I'm hoping to run this on 80,000+ PDFs, so it would be nice not to have to worry about these failures.
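For reference, the manual rerun workaround I describe above could be scripted roughly as follows. This is only a minimal sketch under assumptions: `pending_pdfs`, `run_until_complete`, and `process_batch` are hypothetical names of my own, not part of grobid-client.py, and the `.tei.xml` output suffix is an assumption about how the output files are named. It does not fix the underlying server failure; it only automates "run again until nothing is missing".

```python
from pathlib import Path
from typing import Callable, List


def pending_pdfs(input_dir: str, output_dir: str,
                 suffix: str = ".tei.xml") -> List[Path]:
    """Return the PDFs in input_dir that have no matching output file yet.

    The suffix is an assumption about the output naming scheme.
    """
    out = Path(output_dir)
    return sorted(
        pdf for pdf in Path(input_dir).glob("*.pdf")
        if not (out / (pdf.stem + suffix)).exists()
    )


def run_until_complete(process_batch: Callable[[List[Path]], None],
                       input_dir: str, output_dir: str,
                       suffix: str = ".tei.xml",
                       max_attempts: int = 10) -> List[Path]:
    """Call process_batch on the remaining PDFs until none are left
    or max_attempts is reached. Returns whatever is still unconverted.

    process_batch is a hypothetical callable standing in for one run
    of the client over the given files.
    """
    for _ in range(max_attempts):
        remaining = pending_pdfs(input_dir, output_dir, suffix)
        if not remaining:
            break
        try:
            process_batch(remaining)
        except Exception as exc:
            # A run may die partway through; files already written are
            # kept and skipped on the next attempt.
            print(f"batch failed: {exc}")
    return pending_pdfs(input_dir, output_dir, suffix)
```

Anything still returned by `run_until_complete` after the last attempt would be a candidate for a genuinely problematic file, as opposed to a file that just happened to be in flight when the server stopped.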
I originally thought it was a corrupt file on my end, but it seems to fail at different files each time I run it. Any thoughts?
Thank you!