
A machine learning software for extracting information from scholarly documents

Results 227 grobid issues

In [this document](https://www.sciencedirect.com/science/article/pii/S0305750X17304059), in the first paragraph of the Introduction, the string `consequence of the civil war.^2` (where ^2 is a superscript 2) appears as "consequence of the...

Hi, Thanks for an awesome application! When training our module in a VM the max CPU usage seems to be constant. I've tried to change max number of threads in...

question

Hi, Is there a way to get only sentence coords, or paragraph coords, without GROBID's other parsing? I need only the plain text + coords from the PDF. Thanks!

question
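For the coordinates question above, here is a minimal sketch of reading bounding boxes out of TEI output. It assumes the TEI was produced with coordinates requested for the relevant elements (GROBID's `teiCoordinates` request parameter, with sentence segmentation switched on for `<s>` elements) and that each `coords` attribute holds semicolon-separated boxes of the form `page,x,y,width,height`. The names `Box`, `parse_coords`, `text_with_boxes` and the `SAMPLE` document are hypothetical, made up for illustration. The sketch does not avoid the rest of GROBID's parsing; it only filters the result.

```python
from typing import List, NamedTuple, Tuple
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"


class Box(NamedTuple):
    page: int      # 1-based page number
    x: float       # upper-left x
    y: float       # upper-left y
    width: float
    height: float


def parse_coords(coords: str) -> List[Box]:
    """Split a coords attribute into boxes (assumed 'page,x,y,w,h;...')."""
    boxes = []
    for chunk in coords.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        page, x, y, w, h = chunk.split(",")
        boxes.append(Box(int(page), float(x), float(y), float(w), float(h)))
    return boxes


def text_with_boxes(tei_xml: str, tag: str = "s") -> List[Tuple[str, List[Box]]]:
    """Return (plain text, boxes) for every TEI element named `tag`."""
    root = ET.fromstring(tei_xml)
    result = []
    for element in root.iter(TEI_NS + tag):
        text = "".join(element.itertext()).strip()
        result.append((text, parse_coords(element.get("coords", ""))))
    return result


# Hypothetical TEI fragment, not real GROBID output.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><div>
<p><s coords="1,53.80,194.57,58.71,8.74;1,53.80,205.00,120.50,8.74">First sentence.</s>
<s coords="2,60.00,100.00,200.00,9.00">Second one.</s></p>
</div></body></text></TEI>"""

if __name__ == "__main__":
    for text, boxes in text_with_boxes(SAMPLE):
        print(text, boxes)
```

Passing `tag="p"` instead of the default `"s"` would collect paragraph-level boxes, provided paragraph coordinates were requested.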

I am using GROBID to convert bioRxiv preprint PDFs to XML and finding that paragraph content under the heading 'Funding' is not being captured in the TEI XML output (version...

bug
enhancement

In the [first pubmed evaluation manuscript](https://dx.doi.org/10.1208%2Fs12248-011-9260-2), a number of times 'α2-integrin' is at a line break, e.g.: "was mediated through the inhibition of expression of α2- integrin (1,2). Integrins are...

bug
enhancement

I am trying to crop the images from the pdf by using the coords attributes inside the graphic elements but it looks like the graphic elements won't be generated unless...

How do I call the API? That is, how do I write the API URL for the GROBID service so that it processes my header document and sends me...

question
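For the API question above, a minimal sketch of a header-extraction call using only the Python standard library. It assumes a GROBID service is running locally on its default port 8070 and that the PDF is sent as the multipart form field `input` to `/api/processHeaderDocument`. The function names `build_header_request` and `process_header` and the default file name are hypothetical, chosen for illustration.

```python
import os
import urllib.request
import uuid


def build_header_request(pdf_bytes: bytes,
                         filename: str = "paper.pdf",
                         base_url: str = "http://localhost:8070"):
    """Build a multipart/form-data POST carrying the PDF in the 'input' field."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="input"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=f"{base_url}/api/processHeaderDocument",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def process_header(pdf_path: str, base_url: str = "http://localhost:8070") -> str:
    """Send the PDF to the service and return the response body as text."""
    with open(pdf_path, "rb") as handle:
        request = build_header_request(
            handle.read(),
            filename=os.path.basename(pdf_path),
            base_url=base_url,
        )
    # Requires a reachable GROBID service; raises URLError otherwise.
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read().decode("utf-8")
```

With a running service, `process_header("my_article.pdf")` would return the response body for the document header; the file name here is a placeholder.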

example paper: https://journals.aps.org/prc/abstract/10.1103/PhysRevC.100.014306 (same with #781)

### Fig 1 (missed)

![image](https://user-images.githubusercontent.com/8749411/123935151-8ac74800-d9c6-11eb-8183-4c694b67c2c4.png)

### Fig 2 (wrong head and figDesc)

```xml
B and the 1 + 1 0 11 state of 10 B....
```

bug

I'm trying to train on Korean and English articles together. I generated training data and tagged it as described in the GROBID documentation. Other things (like title, reference title, journal, abstract...