grobid

A machine learning software for extracting information from scholarly documents

Results 227 grobid issues

Hi @kermitt2, I've noticed that list items are excluded from being labeled by the fulltext model. Are you interested in their implementation? Maybe you are considering putting them into...

I ran processFulltextDocument on 22103 arXiv PDFs. 22053 PDFs succeeded and 50 failed. Running on macOS (M2 chip), Java version 17.0.10, server started with Gradle (`./gradlew run`). An example error...

bug
implemented

Hi mighty developers, I am using GROBID for research in which I need to extract text (processFulltextDocument) from some company annual report PDF files. I know GROBID is designed for academic...

This is an error case, not to be forgotten, that causes some trouble with sentence segmentation. The document, referenced at https://dx.doi.org/10.1063/1.1874292, is not CC-BY. Here is the `delinquent` paragraph: With version...

bug
implemented

I have used a Java client in my Maven project. The problem is that the URL where grobid-core was located no longer exists: https://grobid.s3.eu-west-1.amazonaws.com/repo Please, upload or...

Hey, so I'm running grobid on my Mac as a REST service, and on a batch of about 400 documents, a couple of them produce this error (file attached). [error.txt](https://github.com/kermitt2/grobid/files/407876/error.txt)...

macOS-specific

Hi, I am new to Grobid and really need help. I am trying to extract the section headers, and while they do appear normally in the tag, it does not...

When we search for a DOI in the page, the regex may truncate DOIs that are split by a line break, so this PR proposes a simple fix, which is to...

bug
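
The truncation described in the issue above can be illustrated with a small Python sketch. This is not GROBID's actual pattern or the fix proposed in the PR (the preview is cut off before it describes it). `DOI_PATTERN`, `find_dois`, and the `rejoin_lines` heuristic are hypothetical names and assumptions made for illustration only.

```python
import re

# Illustrative DOI shape: "10.", a 4-9 digit registrant, "/", then a suffix.
# Assumption: this is NOT the regex GROBID uses, only a common approximation.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def find_dois(text: str, rejoin_lines: bool = False) -> list[str]:
    """Return DOI-like strings found in text.

    With rejoin_lines=True, a line break that directly follows a typical
    DOI-internal separator ("/", ".", "-", "_") is removed before matching.
    This is a hypothetical heuristic, not the fix from the PR.
    """
    if rejoin_lines:
        text = re.sub(r"(?<=[/.\-_])\r?\n", "", text)
    return DOI_PATTERN.findall(text)


# A DOI split across two lines, as can happen in text extracted from a PDF.
page = "DOI: 10.1063/1.\n1874292 (2005)"

print(find_dois(page))                     # the match stops at the line break
print(find_dois(page, rejoin_lines=True))  # the full DOI is recovered
```

The heuristic is deliberately narrow: it only joins lines that end in a separator, so it can also glue an ordinary sentence-ending period to the next line. A real fix would need to weigh that trade-off against the layout information available in the PDF.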

I've noticed that there are some cases where the DOI is correctly extracted from the article header but is then mangled in the output. Example: [origin9833693929434438741.pdf](https://github.com/user-attachments/files/15757039/origin9833693929434438741.pdf) In this article...