gramps-web-api
Teaching the AI assistant to call tools
(NB: this is a feature request, but also the start of a discussion - I think we need some good ideas first.)
Currently, the AI assistant is not very smart: it can only retrieve individual Gramps objects and doesn't know anything about relationships, so you can't even ask it about your grandfather.
To solve that, we need to teach it how to call tools/functions.
In approaching that, there are several questions to answer:
- which functions should it call?
- how (if at all) can we make tool calling work not just with OpenAI models, but also with open-source models for people running the chat locally?
One challenge I see is that the number of possible functions is quite large:
- retrieve a person by some filter
- retrieve an event by some filter
- find people with a certain relationship
- ...
Although I haven't tried it myself yet, common lore is that an LLM can only reliably identify the right function to call if the number of functions is small, probably below 10.
What I find quite promising is leveraging query languages like GQL or @dsblank's Object QL, of which I suspect the latter is the better choice.
What could be done is the following:
- Create a large number of possible queries that are considered useful for the assistant
- Describe what the query does
- Use an LLM to generate questions based on the description of what the query does
- Use an embedding model to compute vector embeddings for each of the questions and store them with the query (see the sketch below)
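A minimal sketch of that offline step, assuming sentence-transformers as the embedding model; the catalogue entries and query syntax are purely illustrative:

```python
# Offline step: embed the LLM-generated questions for each catalogue query.
# Assumes sentence-transformers; catalogue contents are purely illustrative.
from sentence_transformers import SentenceTransformer

QUERY_CATALOGUE = [
    {
        "query": "person where surname = $surname",  # illustrative query template
        "description": "Find all people with a given surname",
        "questions": [
            "Who in my tree has the surname Miller?",
            "List all people named Miller.",
        ],
    },
    # ... many more entries
]

model = SentenceTransformer("all-MiniLM-L6-v2")

# Store one (vector, query) pair per generated question.
index = []
for entry in QUERY_CATALOGUE:
    vectors = model.encode(entry["questions"])
    for vector in vectors:
        index.append({"vector": vector, "query": entry["query"]})
```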
Now, with these embeddings at hand, when the assistant gets a question, it could
- Calculate the embedding for the question with the same embedding model used for the query language questions
- Use vector similarity to identify the 5 most likely queries (see the retrieval sketch after this list)
- Feed these 5 queries as function calls to the LLM and let it decide which function to use
- Execute the query recommended by the LLM and feed the results back to the LLM
- Generate the answer
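A rough sketch of the retrieval step, reusing the index and embedding model from the sketch above; plain cosine similarity stands in for a real vector index:

```python
import numpy as np

def candidate_queries(user_question: str, index, model, k: int = 5):
    """Return the k distinct catalogue queries whose example questions are
    most similar to the user's question (cosine similarity)."""
    q = model.encode([user_question])[0]

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    ranked = sorted(index, key=lambda e: cosine(q, e["vector"]), reverse=True)
    candidates = []
    for entry in ranked:
        if entry["query"] not in candidates:
            candidates.append(entry["query"])
        if len(candidates) == k:
            break
    return candidates
```

The returned candidates would then be wrapped as function/tool definitions in the chat request, the LLM picks one, the backend executes it against the Gramps database, and the result goes back into the prompt for the final answer.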
Funnily enough, this would even be less resource intensive than the retrieval-based answers, since it only needs a vector index of queries that can be computed in advance once and for all.
I don't think I'll have time to work on this myself in the next 2 months or so, but if anyone experiments with this or has other ideas, please share here!
🤖
Nice summary of the issues. I hope to play around with some of these ideas soon. Might even be a research topic!
A very nice use of an LLM would be to process a Note (such as a transcription of an obituary) for proper names and offer a linking list of people within 2 degrees of separation. Then push the remainder of unmatched names through Doug Blank's Data Entry gramplet (which allows quickly adding new parents, siblings and children).
Something else I've been using manually a lot is taking the output of the OCR text recognition and making the LLM fix all the typos. In combination, this leads to almost perfect transcriptions.
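For reference, a minimal sketch of that cleanup step with the OpenAI client; the model name and prompt are just examples:

```python
from openai import OpenAI

client = OpenAI()
ocr_text = "..."  # raw output of the OCR step

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": (
                "Correct OCR errors in the following transcription. "
                "Fix only obvious recognition mistakes; do not change names or wording."
            ),
        },
        {"role": "user", "content": ocr_text},
    ],
)
cleaned = response.choices[0].message.content
```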
Maybe another way to make the assistant smarter would be to give it access to all tree data along with descriptions of how to use it, like this tool: https://github.com/mindsdb/mindsdb
Maybe the simplest way would be to add tools for the LLM implemented as calls to Gramps for different actions - finding someone, or getting info like the number of children for a given person (by ID): https://python.langchain.com/docs/how_to/tool_calling/
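A minimal sketch of that idea with LangChain's tool-calling interface; the Gramps lookup helper is hypothetical, not an existing gramps-web-api function:

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def children_count(gramps_id: str) -> int:
    """Return the number of children of the person with the given Gramps ID."""
    return len(get_children_ids(gramps_id))  # hypothetical Gramps lookup helper

llm = ChatOpenAI(model="gpt-4o-mini").bind_tools([children_count])
response = llm.invoke("How many children does the person I0042 have?")
# response.tool_calls then tells us which tool to execute with which arguments.
```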
Related: https://gramps.discourse.group/t/release-gramps-web-mcp-v1-0-connect-ai-assistants-to-your-family-tree/8541
(EDIT: wrong link corrected)
Implementing GraphRAG seems like the natural choice rather than a vector store.
See also https://github.com/cabout-me/gramps-mcp/discussions/21
Related: #720
Comment to self: I think the second step in my original post (calculating embeddings of tools) is not necessary anymore given the large context size of modern LLMs. My new attempt is to use Pydantic AI to define tools and let the library handle the prompt magic.
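For anyone curious, roughly what that could look like with Pydantic AI; the Gramps lookup helpers are placeholders, not existing gramps-web-api functions:

```python
from pydantic_ai import Agent

agent = Agent(
    "openai:gpt-4o",
    system_prompt="You answer questions about the user's Gramps family tree.",
)

@agent.tool_plain
def get_person(gramps_id: str) -> dict:
    """Return basic data (name, birth, death) for the person with this Gramps ID."""
    return fetch_person(gramps_id)  # placeholder for a real Gramps lookup

@agent.tool_plain
def get_relationship(person1_id: str, person2_id: str) -> str:
    """Describe the relationship between two people, e.g. 'grandfather'."""
    return compute_relationship(person1_id, person2_id)  # placeholder

result = agent.run_sync("Who is my grandfather?")
print(result.output)  # .data in older pydantic-ai versions
```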