graphrag
graphrag copied to clipboard
A modular graph-based Retrieval-Augmented Generation (RAG) system
This would be implemented after Entity Resolution, generate pseudonyms for PERSON/ORG type entities.
* Extract entities with UniversalNER * Use LLM to extract relationships based on NER output
* Add config for wiring in Azure Insights * Report on some high-level stats in LLM jobs, push out to `stats.json` (e.g. generated tokens per minute, error rate, mean request...
Our CSV parsing tools support a pretty rich set of input parsing options, we should expose some of these into our CSV Input configuration model.
As an alternative to CSV, we should allow structured input from JSONL files, and use similar configs to CSV mode to drill down into text, title, timestamp, etc.. fields.