grobid_client_python
Notify user of current job's progress in output
How would one go about running client.process and then continuing once complete? It seems anything after the process is discarded.
Hello @jacksongoode! I'm not sure I understand the question... This is a client, and the GROBID server remains "warm". The client just sends PDFs and gets back XML, so what exactly would be held by the client here?
Ahh, I see. I managed to get everything working but was confused by the lack of output, even with the verbose flag. Would it be possible to capture the status of the current job through the python client?
Yes sure, we could extend the "verbose" mode to make it more readable and useful. Which information would you like to see?
We could prefix by file name/path and indicate "sent", "output written", things like that maybe? But usually queries are in parallel and pretty fast, it might be a console mess.
In another issue we discussed having a progress bar, but it means counting the files in a first pass and thus slowing the process down a bit, in particular if we consider folders with millions of PDFs (which is a real-world usage in my case). It could be optional?
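To make the trade-off concrete, here is a rough sketch of an optional counting pass. It is not the client's actual code: `iter_pdfs`, `batches`, `process_with_progress`, and the `process_one` callback (a stand-in for "send one PDF, write the XML") are all hypothetical names, and it uses only the standard library. With `count_first=False` it prints a running count only, so there is no extra walk over the folder.

```python
import os
import sys
from concurrent.futures import ThreadPoolExecutor


def iter_pdfs(root):
    """Lazily yield PDF paths under root without building a full list."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pdf"):
                yield os.path.join(dirpath, name)


def batches(iterable, size):
    """Group items into lists of at most `size` elements."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def process_with_progress(root, process_one, batch_size=1000, workers=4,
                          count_first=False, out=sys.stderr):
    """Run process_one(path) on every PDF and report progress per batch.

    process_one is a hypothetical callback, not a real client method.
    """
    # The extra pass over the folder only happens when explicitly requested.
    total = sum(1 for _ in iter_pdfs(root)) if count_first else None
    done = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batches(iter_pdfs(root), batch_size):
            # list() waits for the whole batch and re-raises any exception.
            list(pool.map(process_one, batch))
            done += len(batch)
            if total is None:
                out.write(f"\r{done} files processed")
            else:
                out.write(f"\r{done}/{total} files processed ({100 * done // total}%)")
            out.flush()
    out.write("\n")
    return done
```

Reporting once per batch rather than once per file keeps the console readable when queries run in parallel. A tqdm bar could replace the manual `\r` line if an extra dependency is acceptable.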
Yes, I think something along the lines of a tqdm style progress bar would be really nice. I'm currently working with ~2k PDFs so printing each to console would be a mess.
But for a lot of users, the long pause in the script might cause some concern if they aren't aware that Grobid is doing its job.
In addition to this feature, I am also curious if it makes sense to suppress the output when the file exists unless verbose is specified?
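One possible way to do that, sketched with the standard `logging` module: emit the "already exists" message at DEBUG level so it only shows up when verbose is on. The logger name, `configure_logging`, `should_process`, and the `force` parameter are assumptions for illustration, not the client's current API.

```python
import logging
import os

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("grobid_client_sketch")


def configure_logging(verbose=False):
    """Show per-file messages only when verbose is requested."""
    logging.basicConfig(format="%(message)s")
    logger.setLevel(logging.DEBUG if verbose else logging.WARNING)


def should_process(pdf_path, output_path, force=False):
    """Return False when the output already exists and force is off."""
    if not force and os.path.isfile(output_path):
        # DEBUG level: silent by default, visible with verbose=True.
        logger.debug("%s: output already exists, skipping (%s)",
                     pdf_path, output_path)
        return False
    logger.debug("%s: sending", pdf_path)
    return True
```

With `configure_logging(verbose=False)` a rerun over a mostly processed folder prints nothing for the skipped files, while `verbose=True` still lists each one.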