kernel-memory
Implement new streaming ask endpoint (WIP)
Motivation and Context (Why the change? What's the scenario?)
Draft PR for issue https://github.com/microsoft/kernel-memory/issues/100. This draft PR was discussed very briefly on Discord. The changes still need work. Currently missing:
- More examples (serverless, service, curl, ...)
- ~~Get streaming working in WebClient; it currently still buffers the response (I was not able to find a fix)~~
High level description (Approach, Design)
Most of the code uses the existing AskAsync method as its reference point. The new streaming endpoint does not return a MemoryAnswer; instead, it returns the answer text itself as an async enumerable. If sources are needed, the SearchAsync method can still be called to fetch them, with minimal extra performance overhead.
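To illustrate the difference between a buffered answer and a streamed one, here is a minimal, language-agnostic sketch written in Python rather than C#. It is not the code in this PR: `generate_tokens`, `ask`, `ask_stream`, and `collect` are hypothetical names, and the token list is a hard-coded stand-in for a real text generator.

```python
import asyncio
from typing import AsyncIterator, List


async def generate_tokens(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for a text generator that emits tokens one by one."""
    for token in ["Streaming ", "returns ", "text ", "incrementally."]:
        await asyncio.sleep(0)  # yield control, as a real model call would
        yield token


async def ask(question: str) -> str:
    """Buffered style: wait for the whole answer, then return it in one piece."""
    return "".join([token async for token in generate_tokens(question)])


async def ask_stream(question: str) -> AsyncIterator[str]:
    """Streaming style: pass each chunk to the caller as soon as it is available."""
    async for token in generate_tokens(question):
        yield token


async def collect(question: str) -> List[str]:
    """Consume the stream chunk by chunk, as a client would."""
    return [chunk async for chunk in ask_stream(question)]


if __name__ == "__main__":
    for chunk in asyncio.run(collect("What is streaming?")):
        print(chunk, end="", flush=True)
    print()
```

The streamed chunks concatenate to the same text the buffered call returns; the only difference is that the caller can start rendering before generation finishes.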