grobid
A machine learning software for extracting information from scholarly documents
In [this document](https://www.sciencedirect.com/science/article/pii/S0305750X17304059), the first paragraph in the Introduction, the string `consequence of the civil war.^2` (where ^2 is the 2 as superscript), the text appears as "consequence of the...
Hi, Thanks for an awesome application! When training our module in a VM the max CPU usage seems to be constant. I've tried to change max number of threads in...
Hi, Is there a way to get only sentence coords or paragraph coords, without the rest of GROBID's parsing? I need only the plain text + coords from the PDF. Thanks!
A follow up on #854.
I am using GROBID to convert bioRxiv preprints PDFs to XML and finding that paragraph content with the heading 'Funding' is not being captured in the TEI XML output (version...
In the [first pubmed evaluation manuscript](https://dx.doi.org/10.1208%2Fs12248-011-9260-2), a number of times 'α2-integrin' is at a line break, e.g.: "was mediated through the inhibition of expression of α2- integrin (1,2). Integrins are...
I am trying to crop the images from the pdf by using the coords attributes inside the graphic elements but it looks like the graphic elements won't be generated unless...
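For anyone working with the `coords` attributes mentioned above, here is a minimal sketch of how such a string could be parsed before cropping. It assumes the layout described in the GROBID documentation as I understand it: bounding boxes separated by `;`, each written as `page,x,y,width,height`, with the page number starting at 1 and the other values in PDF points measured from the top-left corner of the page. The `Box` type and `parse_coords` helper are hypothetical names for illustration, not part of GROBID.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """One bounding box (hypothetical helper type, not a GROBID class)."""
    page: int      # page number, assumed to start at 1
    x: float       # left edge, assumed to be in PDF points
    y: float       # top edge, assumed to be measured from the top of the page
    width: float
    height: float


def parse_coords(coords: str) -> List[Box]:
    """Split a coords attribute value into one Box per ';'-separated chunk."""
    boxes = []
    for chunk in coords.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing ';' or an empty attribute
        page, x, y, width, height = chunk.split(",")
        boxes.append(Box(int(page), float(x), float(y), float(width), float(height)))
    return boxes


if __name__ == "__main__":
    # Illustrative value only, not taken from a real GROBID output.
    for box in parse_coords("1,53.8,170.2,238.7,9.0;1,53.8,181.1,100.0,9.0"):
        print(box)
```

To crop an image, each box would still need to be scaled from points to the pixel resolution at which the page was rendered.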
How do I call the API? That is, how should I write the API URL to call the GROBID service so that it processes my header document and sends me...
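As a rough answer to the question above, here is a minimal sketch of a call to the header service using only the Python standard library. It assumes a GROBID server running locally on its default port 8070, and that the PDF is sent as a multipart form field named `input` to `/api/processHeaderDocument`. The helper names `build_header_request` and `extract_header`, and the `paper.pdf` file name, are hypothetical.

```python
import urllib.request
import uuid

# Assumption: GROBID is running locally on its default port.
GROBID_URL = "http://localhost:8070/api/processHeaderDocument"


def build_header_request(pdf_bytes: bytes, url: str = GROBID_URL) -> urllib.request.Request:
    """Build a multipart/form-data POST that carries the PDF in the `input` field."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="input"; filename="paper.pdf"\r\n',
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Accept": "application/xml",  # ask for the TEI XML response
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def extract_header(pdf_path: str) -> str:
    """Send one PDF to the service and return the response body as text."""
    with open(pdf_path, "rb") as fh:
        request = build_header_request(fh.read())
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Requires a running GROBID server and a local file named paper.pdf.
    print(extract_header("paper.pdf"))
```

The request is built separately from the network call so that the URL and form field can be checked without a server running.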
example paper: https://journals.aps.org/prc/abstract/10.1103/PhysRevC.100.014306 (same with #781 ) ### Fig 1(missed) ![image](https://user-images.githubusercontent.com/8749411/123935151-8ac74800-d9c6-11eb-8183-4c694b67c2c4.png) ### Fig 2(wrong head and figDesc) ```xml B and the 1 + 1 0 11 state of 10 B....
I'm trying to train on Korean and English articles together. I generated training data and tagged it as described in the GROBID documentation. Other things (like title, reference title, journal, abstract...