ChatMemory not working with LLM Streaming
**Describe the bug**
In v1.0.0a24, the AI side of chat memory is not logged when the LLM streams data back; tested on both OpenAI and Amazon Bedrock LLMs. If I switch off streaming, the memory shows both User and AI interactions, but if Stream is turned on, the AI side is not present in the memory.
**Browser and Version**
- Brave
- Version 1.65.114
**To Reproduce**
Steps to reproduce the behavior:
- Use the Memory Chatbot example
- On the OpenAI LLM, open Advanced settings and enable Stream
- Interact with the LLM and observe the Text Outputs. In both Stream=true and Stream=false, the most recent interaction is not added to the memory until a new request is sent; but when streaming is enabled, the AI side is never present in the memory at all.
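I haven't dug into the component code, so this is only my guess at the failure mode: a toy sketch (all names made up, not Langflow's actual API) of how a streaming path can return a generator to the UI without ever persisting the AI turn:

```python
from typing import Iterator, List, Tuple

memory: List[Tuple[str, str]] = []          # (sender, text) pairs, standing in for chat memory

def fake_llm_stream(prompt: str) -> Iterator[str]:
    yield from ["Hello", ", ", "world"]     # stand-in for token chunks from the LLM

def respond(prompt: str, stream: bool):
    if not stream:
        text = "".join(fake_llm_stream(prompt))  # full reply exists before we return
        memory.append(("AI", text))              # so the AI turn gets persisted
        return text
    # Streaming path: the generator is handed straight to the UI, so this code
    # never sees the full text again and nothing is written for the AI turn.
    return fake_llm_stream(prompt)

memory.append(("User", "hi"))
for _ in respond("hi", stream=True):
    pass                                         # UI "renders" the stream
print(memory)                                    # [('User', 'hi')] - the AI side is lost
```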
Hey @mieslep
That is interesting.
Let's walk through this.
How do you expect this to behave?
Steps:
- Send a message (no history)
- AI streams a message (no history)
- You send another message (has history; should have two messages IF the filter is Machine and User)
- The Inspect Memory shows 3 messages (User, AI, User)
Is that correct?
Well, based on how it works with Stream=false (default), the Inspect Memory will show only two messages (User, AI).
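To make the difference concrete, this is what I observe after the second message is sent (sender labels are illustrative):

```python
# What Inspect Memory shows me after sending the *second* message:
stream_false = [("User", "msg 1"), ("AI", "reply 1")]  # both sides of round 1 present
stream_true  = [("User", "msg 1")]                     # AI side of round 1 missing
```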
It's debatable whether the Inspect Memory should show the most recent User/AI interaction - that isn't the current behaviour with Stream=false, and it wouldn't be "what was last sent to the LLM" but rather "what would be sent to the LLM"...
To my mind, this issue is a bug, and "how to deal with the most recent interaction" is a separate micro-feature...
I think it all depends on where the Inspect Memory sits in the flow.
If it sits at the end of the flow, it will show all messages; if it sits before the prompt, it will show messages only up to the last round.
Hmm, well, the wiring of that component is a little odd-looking anyhow - the Text anchor is on the right, but ultimately it seems to be "wired" via a common "Session ID", and the visual connection is nothing more than aesthetic positioning?
(regardless of the question of when the memory is updated, when Stream=true, the LLM is given none of its previous output...)
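In case it's useful, one way a streaming path could still persist the AI turn is to wrap the generator so the full text is written once the stream is exhausted - again a toy sketch with made-up names, not a patch against the actual component:

```python
from typing import Iterator, List, Tuple

memory: List[Tuple[str, str]] = []

def fake_llm_stream(prompt: str) -> Iterator[str]:
    yield from ["Hello", ", ", "world"]

def respond_streaming(prompt: str) -> Iterator[str]:
    """Yield chunks to the UI, then persist the full AI turn once the stream ends."""
    collected: List[str] = []
    for chunk in fake_llm_stream(prompt):
        collected.append(chunk)
        yield chunk                               # the UI still gets incremental tokens
    memory.append(("AI", "".join(collected)))     # runs only after the stream is exhausted

memory.append(("User", "hi"))
for _ in respond_streaming("hi"):
    pass
print(memory)  # [('User', 'hi'), ('AI', 'Hello, world')]
```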