
High network usage?

Open jacksongoode opened this issue 2 years ago • 1 comments

Hi again,

Wondering if there was a change from 0.6.2 to 0.7 that led to an increase in network utilization? I actually thought no network access was needed for the XML production, but when I checked my system usage there was a fairly high upload happening. Is this a part of the consolidation step? It would be great to document this, as well as the possibility of running this offline.

But now I'm realizing this could be local network activity of course :upside_down_face:

jacksongoode, Jan 01 '22 06:01

Hello @jacksongoode !

Consolidation of headers is enabled by default, since it helps accuracy and quality. I think 0.7.0 uses https://cloud.science-miner.com/glutton by default, which was more reliable than the CrossRef web API. This explains the network usage. However, this online demo is often not used courteously and is now frequently overloaded, which might explain the problems you are now seeing in your tests.

Running offline is documented: simply set the consolidation parameters to 0 when calling the service, as described in the documentation (or, if you use the command line, use the corresponding documented arguments).
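For illustration, here is a minimal sketch of such a call in Python. It assumes a GROBID service running locally on the default port 8070, the `processFulltextDocument` endpoint, and the third-party `requests` package; the helper names are made up for this example. Check the service documentation for the exact parameters of your version.

```python
from pathlib import Path

# Assumption: GROBID is running locally on its default port.
GROBID_URL = "http://localhost:8070/api/processFulltextDocument"


def offline_form_fields() -> dict:
    """Form fields that switch off both consolidation steps (0 = disabled)."""
    return {
        "consolidateHeader": "0",
        "consolidateCitations": "0",
    }


def process_pdf_offline(pdf_path: str, url: str = GROBID_URL) -> str:
    """Send one PDF to GROBID with consolidation disabled and return the TEI XML."""
    # Third-party dependency, imported here so the helper above needs nothing extra.
    import requests

    with Path(pdf_path).open("rb") as pdf:
        response = requests.post(
            url,
            files={"input": pdf},        # the PDF goes in the multipart "input" field
            data=offline_form_fields(),  # consolidation switched off
            timeout=120,
        )
    response.raise_for_status()
    return response.text
```

With both fields set to 0, the service should not need to contact an external consolidation service for this request, so any remaining traffic would be between your client and the local GROBID server.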

Good luck !!

kermitt2, Jan 01 '22 20:01