ontogpt
LLM-based ontological extraction tools, including SPIRES
BioRED evaluation. This evaluation measures performance of OntoGPT on relation extraction over the BioRED data set (see Luo et al. 2022, https://doi.org/10.1093/bib/bbac282).
Hi, is there any plan to support passing OpenAI parameters to the extraction? For example, the temperature parameter.
Hello, I'm looking for a way to perform grounding using a domain-specific ontology stored locally in .owl format. Is it possible? If so, how can I do it? Thank you!
Primarily of relevance to the GPT-16k models (see also #133), but also to any model with a context limit >4k. Limits are currently hardcoded in many places, but shouldn't...
Extraction (SPIRES) output should include model name (including source), time, and date. It may be useful to include whether the query was run against the live API or served from a cached result,...
Running the following command: ``` $ ontogpt -vvv pubmed-annotate -t gocam --get-pmc --limit 1 "27849154" ``` results in an error: ``` ... INFO:ontogpt.engines.knowledge_engine:GROUNDING HSN using Gene INFO:ontogpt.engines.knowledge_engine: Annotators: ['gilda:', 'sqlite:obo:pr']...
Pydantic versions of the extraction templates are generated through linkml's `gen-pydantic`, and that's currently called through the Makefile doing this:
```
poetry run gen-pydantic --pydantic_version 2 $< > $@
```
...
This would apply for all SPIRES extractions, but particularly for PubMed extractions. It would be useful to know summary statistics, e.g., how many entities, how many relations, how many grounded...
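As a rough illustration of the kind of summary being asked for, here is a minimal Python sketch. It does not use OntoGPT's actual output classes: the function name `summarize_extraction`, the flat list of entity identifiers, the relation triples, and the convention that ungrounded entities carry an `AUTO:` prefix are all assumptions made for the example.

```python
def summarize_extraction(entity_ids, relations, ungrounded_prefix="AUTO:"):
    """Count entities, relations, and grounded entities for one extraction.

    Assumption: an entity is "grounded" when its identifier does not start
    with `ungrounded_prefix`. The input shapes here are hypothetical.
    """
    total = len(entity_ids)
    grounded = sum(1 for e in entity_ids if not e.startswith(ungrounded_prefix))
    return {
        "entities": total,
        "relations": len(relations),
        "grounded": grounded,
        # Avoid dividing by zero when nothing was extracted.
        "grounded_fraction": grounded / total if total else 0.0,
    }


if __name__ == "__main__":
    # Placeholder identifiers, not real ontology terms.
    ids = ["EX:1", "AUTO:foo", "EX:2", "AUTO:bar"]
    rels = [("EX:1", "related_to", "EX:2")]
    print(summarize_extraction(ids, rels))
```

The same dictionary could be computed per document and then aggregated across a PubMed run.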
@enockniyonkuru suggests including a threshold on PubMed extraction outputs, such that we can select for documents yielding the same entities or relations (e.g., specify which results appear in 3 or...
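One way such a threshold could work, sketched in plain Python. The mapping from document ID to extracted entity IDs is a hypothetical stand-in for real extraction output, and `entities_meeting_threshold` is an illustrative name, not an existing OntoGPT function.

```python
from collections import Counter


def entities_meeting_threshold(doc_entities, min_docs=3):
    """Return the entities that occur in at least `min_docs` documents.

    `doc_entities` maps a document ID to the entity IDs extracted from it
    (a hypothetical structure used only for this sketch).
    """
    counts = Counter()
    for entities in doc_entities.values():
        # Use a set so an entity repeated within one document counts once.
        counts.update(set(entities))
    return {entity for entity, n in counts.items() if n >= min_docs}


if __name__ == "__main__":
    # Placeholder document and entity identifiers.
    results = {
        "doc1": ["EX:1", "EX:2", "EX:2"],
        "doc2": ["EX:2", "EX:3"],
        "doc3": ["EX:2", "EX:1"],
        "doc4": ["EX:1"],
    }
    print(sorted(entities_meeting_threshold(results, min_docs=3)))
```

The same counting approach would apply to relations if each relation is represented as a hashable tuple.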
It's a bit awkward to pass pre-compiled corpora of documents (as lists of PMIDs) to the PubMed-based extractor. It could be easier just to provide an input file with these...
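A minimal sketch of what reading such an input file might look like, assuming one PMID per line, with blank lines and `#` comments ignored. The file layout and the helper name `parse_pmids` are assumptions for illustration; nothing here reflects an existing OntoGPT option.

```python
def parse_pmids(lines):
    """Parse PMIDs from an iterable of text lines.

    Assumed format: one numeric PMID per line; anything after '#' is a
    comment. Duplicates are dropped while preserving first-seen order.
    """
    pmids = []
    seen = set()
    for raw in lines:
        text = raw.split("#", 1)[0].strip()
        if not text:
            continue  # blank line or comment-only line
        if not text.isdigit():
            raise ValueError(f"not a PMID: {text!r}")
        if text not in seen:
            seen.add(text)
            pmids.append(text)
    return pmids


if __name__ == "__main__":
    sample = ["27849154\n", "\n", "# a comment\n", "27849154\n"]
    print(parse_pmids(sample))
```

With a real file, the open file handle can be passed directly, for example `parse_pmids(open("pmids.txt", encoding="utf-8"))`, where `pmids.txt` is a hypothetical filename.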