grobid_client_python
No files written to output directory & no API call
Hi! I'm trying to use GROBID to generate .tei.xml files for some PDFs. I installed GROBID on my local machine according to the docs, using the following commands:
wget https://github.com/kermitt2/grobid/archive/0.6.0.zip
unzip 0.6.0.zip
Since I do not have many PDFs that I want to annotate, I chose to use the public GROBID service at http://cloud.science-miner.com/grobid/ and updated the config.json file accordingly. However, when I run the commands below, nothing is written to the OUTPUTS directory and no API call is made either. I also checked the web service: when I tried to process a full-text document there, I got a 503 error saying that the processFulltextDocument service is not available. None of the other services (processHeader, etc.) are available either.
xxx grobid-client-python % python3 grobid-client.py --input ~/Desktop/GROBID/test --output ~/Desktop/GROBID/OUTPUTS --config ~/Desktop/GROBID/grobid-client-python/config.json --force processFulltextDocument
GROBID server is up and running
2 PDF files to process
/Users/samanthadalal/Desktop/GROBID/test/NeuralRegenRes12122021-5821355_161013.pdf
/Users/samanthadalal/Desktop/GROBID/test/nmc-59-213.pdf
runtime: 13.629 seconds
xxx grobid-client-python % cd ..
xxx GROBID % cd OUTPUTS
xxx OUTPUTS % ls
xxx OUTPUTS %
Could you please help me resolve this issue? I would very much appreciate it! Thank you :)
Hi @samanthadalal, the public GROBID service was overloaded. I restarted it to clear the queue, so you can try again now, but it might become saturated again because some people have launched heavy batches on it. I should probably reduce the quotas. In any case, if you want to process files reliably, it is better to use a local install: it's not complicated, and it works on low-end hardware.
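For anyone who hits the same 503 responses from a saturated server, a wait-and-retry loop could be sketched roughly as below. This is only an illustration, not the client's actual code: `retry_on_busy` and the `send` callable are hypothetical names, and the attempt count and wait time are arbitrary assumptions.

```python
import time
from typing import Callable, Optional, Tuple

BUSY = 503  # status reported in this thread when the service was unavailable


def retry_on_busy(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    wait_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call send() until it stops answering 503 or the attempts run out.

    `send` is a hypothetical callable that performs one request and
    returns (status_code, body). The body is returned on HTTP 200;
    None is returned on any other outcome.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status != BUSY:
            # Any other status is treated as a hard failure, not retried.
            print("request failed with status %d" % status)
            return None
        print("server busy (attempt %d of %d)" % (attempt, max_attempts))
        if attempt < max_attempts:
            sleep(wait_seconds)
    return None
```

Here `send` would wrap whatever HTTP call you make to the service. Printing each busy response makes it visible when the server is saturated, so an empty output directory does not go unexplained.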