grobid
High network usage?
Hi again,
Wondering if there was a change from 0.6.2 to 0.7 that led to an increase in network utilization? I thought no network access was needed for the XML production, but when I checked my system usage there was a fairly high upload happening. Is this a part of the consolidation step? It would be great to document this, as well as the possibility of running offline.
But now I'm realizing this could be local network activity of course :upside_down_face:
Hello @jacksongoode !
Consolidation of headers is enabled by default (this is helpful for accuracy/quality).
I think 0.7.0 uses https://cloud.science-miner.com/glutton by default, which was more reliable than the CrossRef web API. This explains the network usage.
However, this online demo is not really used in a courteous manner and is now frequently overloaded, so it might explain your new pain with the tests.
About running offline, this is documented. Simply set the consolidation parameters to 0 when calling the service as described (or, if you use the command line, use the corresponding documented arguments).
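For illustration, here is a minimal Python sketch of a service call with consolidation switched off. It assumes a GROBID server running locally on port 8070 and the `processFulltextDocument` endpoint, and it assumes the consolidation parameters are named `consolidateHeader` and `consolidateCitations`; please check these against the service documentation for your version. The helper names (`build_multipart`, `process_offline`) are made up for this example.

```python
import urllib.request
import uuid

# Assumption: a local GROBID server on the default port.
GROBID_URL = "http://localhost:8070/api/processFulltextDocument"

# Assumed parameter names; 0 is meant to disable consolidation,
# so no external lookup service should be contacted.
OFFLINE_FIELDS = {"consolidateHeader": "0", "consolidateCitations": "0"}


def build_multipart(pdf_bytes, fields, filename="paper.pdf"):
    """Encode the form fields and the PDF as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # The PDF itself goes in the "input" form field.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="input"; filename="{filename}"\r\n'
            "Content-Type: application/pdf\r\n\r\n"
        ).encode("utf-8")
        + pdf_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def process_offline(pdf_path, url=GROBID_URL):
    """Send one PDF to the service with consolidation disabled; return the TEI XML."""
    with open(pdf_path, "rb") as f:
        body, content_type = build_multipart(f.read(), OFFLINE_FIELDS)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, `process_offline("paper.pdf")` should return the TEI XML without triggering consolidation traffic, if the assumptions above hold for your setup.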
Good luck !!