
What are the implications of allowing more documents as context?

cachho opened this issue 2 years ago · 1 comment

Let's talk about this method:

def query(self, input_query):
    """
    Queries the vector database based on the given input query.
    Gets relevant doc based on the query and then passes it to an
    LLM as context to get the answer.

    :param input_query: The query to use.
    :return: The answer to the query.
    """
    result = self.collection.query(
        query_texts=[input_query,],
        n_results=1,
    )
    result_formatted = self._format_result(result)
    answer = self.get_answer_from_llm(input_query, result_formatted[0][0].page_content)
    return answer

As far as I can tell (and I'm just reading, not necessarily understanding, so correct me if I'm wrong), it will return the single closest document, because of n_results=1.

What if we have a more granular database, cut into smaller pieces?

E.g. the webpages and documents we added are each only a paragraph long, so the query will return just one paragraph. Now imagine a user asks a complex question whose correct answer is spread across more than one document. Then it would only answer part of the question, with limited knowledge.

Here's a simple example. Let's say we are in the car business and feed our database information about the Corvette, one page for each generation. Then a user asks how much horsepower does the current Corvette make and how much did the first one make?. If my understanding is correct, it could not answer that question (for this specific question, ChatGPT knows the answer out of the box, but you get the point).

For these kinds of use cases I'm proposing to allow the retrieval of more than one document, configurable by the user; 1 can stay as the default. These are then all passed as context so an LLM can do its magic and process the information (see the sketch below).
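
For concreteness, here is a minimal sketch of what that could look like. The number_documents parameter name is just an illustration, and I'm assuming _format_result returns a list of (document, score) tuples, based on the result_formatted[0][0].page_content access above:

def query(self, input_query, number_documents=1):
    """
    Queries the vector database based on the given input query.
    Gets the relevant docs based on the query and then passes them to an
    LLM as context to get the answer.

    :param input_query: The query to use.
    :param number_documents: How many documents to retrieve as context (default 1).
    :return: The answer to the query.
    """
    result = self.collection.query(
        query_texts=[input_query,],
        n_results=number_documents,
    )
    result_formatted = self._format_result(result)
    # Join the page_content of every retrieved document into one context string
    # instead of taking only the single closest match.
    context = " | ".join(
        document.page_content for document, _score in result_formatted
    )
    answer = self.get_answer_from_llm(input_query, context)
    return answer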

The downside I can see is that it will require more tokens, and thus cost more. This is a compromise the user has to make for better results. ~~The max token limit should also be considered, especially in cases where the database contains short and long text. For this edge case, max tokens should be configurable by the user, and if a limit is set, the tokens of the prompt should be counted and the context cut off if necessary.~~ edit: OpenAI has a max tokens parameter that does all of this
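
In case prompt-side truncation ever does become relevant, a rough sketch of the struck-through idea could look like this (the function name, token budget, and encoding name are assumptions for illustration, not anything in the library):

import tiktoken

def truncate_context(context, max_prompt_tokens=3000, encoding_name="cl100k_base"):
    # Count the context's tokens and trim from the end until it fits the budget.
    enc = tiktoken.get_encoding(encoding_name)
    tokens = enc.encode(context)
    if len(tokens) <= max_prompt_tokens:
        return context
    return enc.decode(tokens[:max_prompt_tokens])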

P.S. Why are we prompting with prompt = f"""Use the following pieces of context to answer the query at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. {context} if we only use one piece of context?

I will propose a PR for this.

cachho · Jun 23 '23 13:06

Hey, thanks for opening the issue. Let me think more and get back to you on this.

taranjeet · Jun 23 '23 14:06

Closed with #63.

cachho · Jul 11 '23 14:07