akeru
Support streaming as part of thread runs and LLM generations
Problem Statement
- Based on a previous PR #9, we introduced the concept of thread runs, where we wait for a response based on the content of the thread.
- We should have the option to stream the answer back to the consumer of the API, to accommodate the slow response time of LLMs.
- While the adapters are currently generators, we have not yet supported streaming over the network.
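As a rough sketch of what this could look like, the adapter's generator can be bridged to a streaming HTTP body via a `ReadableStream`, so the consumer reads tokens as they are produced instead of waiting for the full answer. The names below (`generateTokens`, `toStream`) are illustrative placeholders, not the actual akeru API:

```typescript
// Stand-in for an LLM adapter that yields chunks as they arrive.
// (Hypothetical name; the real adapter interface may differ.)
async function* generateTokens(): AsyncGenerator<string> {
  for (const chunk of ["Hello", ", ", "world", "!"]) {
    yield chunk;
  }
}

// Wrap the async generator in a ReadableStream so it can be returned
// as a streaming HTTP response body, e.g. `new Response(toStream(gen))`.
function toStream(gen: AsyncGenerator<string>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream({
    async pull(controller) {
      const { value, done } = await gen.next();
      if (done) {
        controller.close();
      } else {
        controller.enqueue(encoder.encode(value));
      }
    },
  });
}
```

The consumer can then read the body incrementally with `response.body.getReader()` and render partial output, rather than blocking on the complete generation.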