khoj
Add better support for meta-level/summarization questions
One limitation of Khoj currently is that it's not very good at answering meta-level questions. For example, questions like "How many notes do I have about Spanish lessons?" or "What was the first issue related to offline models in the Khoj repository?" won't be answered very effectively. More categorical, specific questions, however, do perform well.
This may require rethinking how we index data. Now that we're introducing a Postgres backend (refer to the efforts in the PRs from #487 and others tagged with [Multi-User]), we could add an extra layer that lets the LLM run any subqueries needed to answer this kind of request.
For example, "How many notes do I have about Spanish lessons?" can be converted to something like SELECT * FROM database_embeddings WHERE distance < 0.1, where distance is computed between the vectorized column and the query string.
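
To make the idea concrete, here is a rough sketch of the counting subquery, not a proposal for the actual implementation. It assumes cosine distance as the metric and uses toy two-dimensional vectors in place of real embeddings; the function names, the `embedding` column, and the use of pgvector's `<=>` cosine distance operator in the SQL string are all assumptions for illustration. Only the `database_embeddings` table name and the 0.1 threshold come from the example above.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def count_matching_entries(query_vector, entries, max_distance=0.1):
    """Count entries whose embedding is within max_distance of the query.

    `entries` is an iterable of (name, embedding) pairs. This is an in-memory
    stand-in for the database subquery, so the logic can be checked without
    a running Postgres instance.
    """
    return sum(
        1
        for _name, embedding in entries
        if cosine_distance(query_vector, embedding) < max_distance
    )


# Hypothetical SQL equivalent, assuming the pgvector extension and an
# `embedding` vector column (both are assumptions, not the current schema).
PGVECTOR_COUNT_SQL = """
SELECT COUNT(*)
FROM database_embeddings
WHERE embedding <=> %(query_vector)s < %(max_distance)s
"""

# Toy data: two notes close to the query direction, one unrelated note.
EXAMPLE_ENTRIES = [
    ("spanish-lesson-1", [1.0, 0.0]),
    ("spanish-lesson-2", [0.99, 0.05]),
    ("cooking-recipe", [0.0, 1.0]),
]
EXAMPLE_QUERY = [1.0, 0.0]
```

Calling `count_matching_entries(EXAMPLE_QUERY, EXAMPLE_ENTRIES)` counts the two nearby notes and skips the unrelated one. In practice the threshold would probably need tuning per embedding model, and the LLM layer would still have to decide when a question calls for a count rather than a normal retrieval.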
I do think this would generally be useful for any kind of meta-analysis.
Or would a MemGPT-style approach to memory be a better way to go? (see the MemGPT GitHub code)