grobid topic
scipdf_parser
Python PDF parser for scientific publications: content and figures
sciencebeam-parser
A set of tools to allow PDF to XML conversion, utilising Apache Beam and other tools. The aim of this project is to bring multiple tools together to generate a full XML document.
papercast
A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, or PDF with GROBID and LangChain, and listen to them as a podcast. Customize your own pipelines...
grobid-superconductors
Grobid module for superconductor material and properties extraction
structure-vision
Viewer for the structure extracted by Grobid on PDF documents
Biomedical-Knowledge-Graph
Information extraction from unstructured text to build a knowledge graph, using techniques ranging from traditional NLP to pre-trained transformers and LLMs for named entity recognition (NER), entity linking, and relation extraction.