
VectorStoreRetrieverMemory does not respect input_key, stores additional keys

Open feffy380 opened this issue 1 year ago • 0 comments

The current implementation excludes only the input whose key matches memory_key. With CombinedMemory there are multiple input keys, so the vector store memory saves everything except its own memory_key. This is unwanted: in my case, one of the other keys holds the entire chat history.

This seems to be the relevant function in VectorStoreRetrieverMemory:

    def _form_documents(
        self, inputs: Dict[str, Any], outputs: Dict[str, str]
    ) -> List[Document]:
        """Format context from this conversation to buffer."""
        # Each document should only include the current turn, not the chat history
        filtered_inputs = {k: v for k, v in inputs.items() if k != self.memory_key}
        # <snip>
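
One possible shape for a fix, sketched as a standalone function rather than a patch to the class. The helper name `select_inputs_to_store` and the sample turn are hypothetical and are not part of langchain. The sketch only covers the filtering step quoted above and leaves the snipped part of `_form_documents` alone.

```python
from typing import Any, Dict, Optional


def select_inputs_to_store(
    inputs: Dict[str, Any],
    memory_key: str,
    input_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: choose which inputs end up in the stored document."""
    if input_key is not None:
        # Proposed behavior: keep only the designated input.
        return {input_key: inputs[input_key]}
    # Behavior of the filter quoted above: drop only this memory's own key.
    return {k: v for k, v in inputs.items() if k != memory_key}


# Made-up inputs for one turn, as a combined memory setup might supply them.
turn = {
    "input": "What is my name?",
    "history": "Human: I'm Sam\nAI: Hi Sam!",
    "documents": "(retrieved snippets)",
}

current = select_inputs_to_store(turn, memory_key="documents")
proposed = select_inputs_to_store(turn, memory_key="documents", input_key="input")
print(sorted(current))   # ['history', 'input']
print(sorted(proposed))  # ['input']
```

Falling back to the existing filter when `input_key` is unset would keep today's behavior for anyone who does not set it.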

Example:

from langchain.memory import (
    CombinedMemory,
    ConversationTokenBufferMemory,
    VectorStoreRetrieverMemory,
)

# `llm` and `retriever` are assumed to be defined elsewhere.
template = (
    "Relevant pieces of previous conversation:\n"
    "=====\n"
    "{documents}\n"
    "=====\n"
    "Chat log:\n"
    "{history}\n\n"
)
buffer_memory = ConversationTokenBufferMemory(
    input_key="input",
    memory_key="history",
    llm=llm,
)
vector_memory = VectorStoreRetrieverMemory(
    input_key="input",
    memory_key="documents",
    retriever=retriever,
)
combined_memory = CombinedMemory(memories=[vector_memory, buffer_memory])

Current behavior: the vector store memory saves input and history.

Expected behavior: respect input_key and only save input in the vector store (in addition to the response).
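
To make the difference concrete without a real LLM or retriever, here is a small simulation using stand-in classes. `StubMemory` and `StubCombinedMemory` are hypothetical and only mimic the behavior described in this report. In particular, the sketch assumes the combined memory passes the same inputs dict to every sub-memory, and the `response` output key is made up for illustration.

```python
from typing import Any, Dict, List, Optional


class StubMemory:
    """Hypothetical stand-in for a memory; records what each turn saved."""

    def __init__(
        self,
        memory_key: str,
        input_key: Optional[str] = None,
        respect_input_key: bool = True,
    ) -> None:
        self.memory_key = memory_key
        self.input_key = input_key
        self.respect_input_key = respect_input_key
        self.saved: List[Dict[str, Any]] = []

    def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
        if self.respect_input_key and self.input_key is not None:
            # Behavior the report expects: keep only inputs[input_key].
            kept = {self.input_key: inputs[self.input_key]}
        else:
            # Behavior the report describes: drop only this memory's own key.
            kept = {k: v for k, v in inputs.items() if k != self.memory_key}
        self.saved.append({**kept, **outputs})


class StubCombinedMemory:
    """Hypothetical stand-in: forwards the same inputs to every sub-memory."""

    def __init__(self, memories: List[StubMemory]) -> None:
        self.memories = memories

    def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
        for memory in self.memories:
            memory.save_context(inputs, outputs)


# The vector stub ignores input_key (reported behavior); the buffer stub honors it.
vector = StubMemory("documents", input_key="input", respect_input_key=False)
buffer = StubMemory("history", input_key="input")
combined = StubCombinedMemory([vector, buffer])

combined.save_context(
    {"input": "hi", "history": "(entire chat log)", "documents": "(snippets)"},
    {"response": "hello"},
)
print(sorted(vector.saved[0]))  # ['history', 'input', 'response']
print(sorted(buffer.saved[0]))  # ['input', 'response']
```

Constructing the vector stub with `respect_input_key=True` makes both stubs save the same keys, which is the expected behavior described above.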

For comparison, when input_key is specified, ConversationTokenBufferMemory only saves inputs[input_key] as expected.

feffy380 • Apr 30 '23 22:04