Add better support for meta-level/summarization questions

Open sabaimran opened this issue 1 year ago • 1 comments

One current limitation of Khoj is that it's not very good at answering meta-level questions. For example, "How many notes do I have about Spanish lessons?" or "What was the first issue related to offline models in the Khoj repository?" won't get good answers. More categorical, specific questions, however, do perform well.

This may require rethinking how we index data. Now that we're introducing a Postgres backend (refer to efforts in the PRs from #487 and others tagged with [Multi-User]), we could add an extra layer that lets the LLM run any subqueries needed to answer such a request.

For example, "How many notes do I have about Spanish lessons?" can be converted to something like `SELECT * FROM database_embeddings WHERE distance < 0.1`, where `distance` is computed between the vectorized column and the query string.
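To make the idea concrete, here is a minimal sketch of a distance-thresholded count. It is not how Khoj does or will implement this: it uses an in-memory SQLite database as a stand-in for Postgres, a hand-written `cosine_distance` SQL function, and tiny made-up 3-dimensional vectors instead of real embeddings. The table name `database_embeddings` comes from the query above; the `note` and `embedding` columns and the helper names are hypothetical. Since the question asks "how many", the sketch uses `COUNT(*)` rather than `SELECT *`.

```python
import json
import math
import sqlite3


def cosine_distance(a_json: str, b_json: str) -> float:
    """Cosine distance between two JSON-encoded vectors (0 = same direction)."""
    a, b = json.loads(a_json), json.loads(b_json)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def count_similar(conn, query_vector, threshold=0.1):
    """Count rows whose embedding lies within `threshold` of the query vector."""
    row = conn.execute(
        "SELECT COUNT(*) FROM database_embeddings "
        "WHERE cosine_distance(embedding, ?) < ?",
        (json.dumps(query_vector), threshold),
    ).fetchone()
    return row[0]


def build_demo_db():
    """Create an in-memory table with toy, hand-picked embeddings (assumption)."""
    conn = sqlite3.connect(":memory:")
    # Register the Python function so it can be called from SQL.
    conn.create_function("cosine_distance", 2, cosine_distance)
    conn.execute("CREATE TABLE database_embeddings (note TEXT, embedding TEXT)")
    rows = [
        ("Spanish lesson 1: greetings", [0.9, 0.1, 0.0]),
        ("Spanish lesson 2: verbs", [0.95, 0.05, 0.0]),
        ("Grocery list", [0.0, 0.2, 0.9]),
    ]
    conn.executemany(
        "INSERT INTO database_embeddings VALUES (?, ?)",
        [(note, json.dumps(vec)) for note, vec in rows],
    )
    return conn


conn = build_demo_db()
spanish_query = [1.0, 0.0, 0.0]  # stand-in for the embedded question
print(count_similar(conn, spanish_query))  # prints 2
```

In a real deployment the query vector would come from the same embedding model used at index time, and the `0.1` threshold would need tuning, since a fixed cutoff decides what counts as "about Spanish lessons".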

I do think this would generally be useful for any kind of meta-analysis.

See relevant discussion in Discord.

sabaimran avatar Oct 19 '23 18:10 sabaimran

Or would a MemGPT-style approach to memory be a better way to go? (See the MemGPT code on GitHub.)

FetchFast avatar Oct 21 '23 15:10 FetchFast