grobid_client_python

Notify user of current job's progress in output

Open jacksongoode opened this issue 3 years ago • 5 comments

How would one go about running client.process and then continuing once complete? It seems anything after the process is discarded.

jacksongoode, May 20 '21 21:05

Hello @jacksongoode! Not sure I understand the question... this is a client, and the GROBID server remains "warm". The client just sends a PDF and gets back XML, so what exactly would be held by the client here?

kermitt2, May 20 '21 22:05

Ahh, I see. I managed to get everything working but was confused by the lack of output, even with the verbose flag. Would it be possible to capture the status of the current job through the Python client?

jacksongoode, May 22 '21 00:05

> Would it be possible to capture the status of the current job through the Python client?

Yes, sure, we could extend the "verbose" mode to make it more readable and useful. Which information would you like to see?

We could prefix each message with the file name/path and indicate "sent", "output written", things like that maybe? But queries usually run in parallel and are pretty fast, so it might make a mess of the console.

In another issue we discussed having a progress bar, but it means counting the files in a first pass and thus slowing the process down a bit, in particular for a folder with millions of PDFs (which is a real-world usage in my case). It could be optional?

kermitt2, May 22 '21 00:05
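
For illustration, here is a minimal sketch of the two ideas above: a per-file status line ("output written" or "failed") and a counting pass that stays optional. It is not the client's actual code. `iter_pdfs`, `run_batch`, `process_one` and `count_first` are hypothetical names, and `process_one` merely stands in for whatever sends one PDF to the GROBID server.

```python
import os
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def iter_pdfs(root):
    """Yield PDF paths under root lazily, without building a full list."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(".pdf"):
                yield os.path.join(dirpath, name)


def run_batch(root, process_one, workers=4, count_first=False, out=sys.stderr):
    """Process every PDF under root and print one status line per file.

    process_one is a hypothetical callable standing in for the real
    request to the GROBID server; it takes a path and may raise on error.
    """
    # Optional first pass: the cost of counting is only paid on request.
    total = sum(1 for _ in iter_pdfs(root)) if count_first else None
    done = 0
    failed = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to its path so the status line can name the file.
        futures = {pool.submit(process_one, p): p for p in iter_pdfs(root)}
        for future in as_completed(futures):
            done += 1
            path = futures[future]
            try:
                future.result()
                status = "output written"
            except Exception as exc:
                failed += 1
                status = f"failed ({exc})"
            progress = f"{done}/{total}" if total is not None else str(done)
            print(f"[{progress}] {path}: {status}", file=out)
    return done, failed
```

Status lines are printed from the main thread as results come back, so they do not interleave even though the requests run in parallel. The sketch submits every file up front, which would not be reasonable for millions of PDFs; a real implementation would presumably submit in batches.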

Yes, I think something along the lines of a tqdm-style progress bar would be really nice. I'm currently working with ~2k PDFs, so printing each one to the console would be a mess.

But for a lot of users, the long pause in the script might cause some concern if they aren't aware that GROBID is doing its job.

jacksongoode, May 24 '21 19:05
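
To make the single-line progress bar idea concrete, here is a rough standard-library sketch that redraws one line instead of printing one line per PDF. tqdm itself would do this better; the sketch just avoids the extra dependency. `render_bar` and `report` are hypothetical helpers, not part of the client, and the total is assumed to be known from a counting pass.

```python
import sys


def render_bar(done, total, width=20):
    """Return a text progress bar such as '[#####-----] 5/10 (50%)'."""
    # Guard against an empty folder so we never divide by zero.
    filled = width * done // total if total else 0
    percent = 100 * done // total if total else 0
    bar = "#" * filled + "-" * (width - filled)
    return f"[{bar}] {done}/{total} ({percent}%)"


def report(done, total, out=sys.stderr):
    """Redraw the bar in place; end the line once everything is done."""
    # The carriage return moves the cursor back to the start of the line.
    out.write("\r" + render_bar(done, total))
    if done >= total:
        out.write("\n")
    out.flush()
```

Calling `report(done, total)` after each completed file keeps the console to a single line, which should be enough to show that GROBID is still working through the folder.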

In addition to this feature, I am also curious whether it would make sense to suppress the output when the file already exists, unless verbose is specified?

jacksongoode, Jan 20 '22 22:01
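
One possible way to get that behaviour is to route the "already exists" message through Python's `logging` module at debug level, so it only shows up when verbose is on. This is only a sketch under assumptions: `configure_logging`, `needs_processing`, the `force` parameter and the logger name are all hypothetical and are not taken from the client's code.

```python
import logging
import os

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("grobid_client_sketch")


def configure_logging(verbose=False):
    """Show debug messages only when verbose is requested."""
    logging.basicConfig(format="%(levelname)s %(message)s")
    logger.setLevel(logging.DEBUG if verbose else logging.WARNING)


def needs_processing(output_path, force=False):
    """Return False when the output already exists and force is off."""
    if not force and os.path.isfile(output_path):
        # Debug level: hidden by default, visible in verbose mode.
        logger.debug("%s already exists, skipping", output_path)
        return False
    return True
```

With `configure_logging(verbose=False)` a rerun over an already-processed folder stays quiet, while `configure_logging(verbose=True)` still lists every skipped file.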