feat: stream LLM responses
Only the Databricks provider to start, but it should not be hard to do this for other providers that support streaming.
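For context on the mechanics, here is a minimal sketch of consuming an OpenAI-compatible SSE chat-completions stream, which is the chunk format Databricks serving endpoints expose. The URL, token, and model below are placeholders, not the provider code in this PR:

```rust
// Minimal sketch: consume an SSE stream of chat completion chunks and
// print text deltas as they arrive. Assumes an OpenAI-compatible endpoint.
use std::io::Write;

use futures::StreamExt;
use serde_json::Value;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let body = serde_json::json!({
        "model": "my-model",   // placeholder
        "stream": true,        // ask the server for SSE chunks
        "messages": [{"role": "user", "content": "hello"}],
    });
    let resp = reqwest::Client::new()
        .post("https://example.databricks.net/serving-endpoints/chat/completions") // placeholder
        .bearer_auth("TOKEN")  // placeholder
        .json(&body)
        .send()
        .await?;

    let mut stream = resp.bytes_stream();
    let mut buf = String::new();
    while let Some(chunk) = stream.next().await {
        buf.push_str(&String::from_utf8_lossy(&chunk?));
        // SSE events are separated by blank lines; each `data:` line holds one JSON chunk.
        while let Some(pos) = buf.find("\n\n") {
            let event: String = buf.drain(..pos + 2).collect();
            for line in event.lines() {
                if let Some(data) = line.strip_prefix("data: ") {
                    if data == "[DONE]" {
                        return Ok(()); // server signals end of stream
                    }
                    let v: Value = serde_json::from_str(data)?;
                    if let Some(delta) = v["choices"][0]["delta"]["content"].as_str() {
                        print!("{delta}");          // emit partial text immediately
                        std::io::stdout().flush()?; // don't buffer the partial line
                    }
                }
            }
        }
    }
    Ok(())
}
```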
https://github.com/user-attachments/assets/6fa08fc9-38cb-4b68-a336-7093d2614650
@jamadeo looks really great - but there is a bug where it seems to not call tools correctly, so we may need to chase that down.
> @jamadeo looks really great - but there is a bug where it seems to not call tools correctly
Thanks @michaelneale - forgot to mention here, but I fixed this issue. My first stab didn't reassemble streamed tool call messages (they are chunked too), but I've since added that.
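For anyone reviewing that part: streamed tool calls arrive as fragments keyed by an `index`, with the call's id, function name, and JSON arguments split across chunks. Here is a sketch of the reassembly, assuming the OpenAI-style chunk shape; the struct and field names are illustrative, not goose's actual types:

```rust
use serde_json::Value;

#[derive(Default, Debug)]
struct PartialToolCall {
    id: String,
    name: String,
    arguments: String, // accumulated JSON argument fragments
}

/// Fold one streamed chunk's `delta.tool_calls` into the accumulator.
fn accumulate_tool_calls(acc: &mut Vec<PartialToolCall>, delta: &Value) {
    let Some(calls) = delta["tool_calls"].as_array() else { return };
    for call in calls {
        // `index` says which in-flight tool call this fragment belongs to.
        let idx = call["index"].as_u64().unwrap_or(0) as usize;
        if acc.len() <= idx {
            acc.resize_with(idx + 1, PartialToolCall::default);
        }
        let entry = &mut acc[idx];
        if let Some(id) = call["id"].as_str() {
            entry.id.push_str(id);
        }
        if let Some(name) = call["function"]["name"].as_str() {
            entry.name.push_str(name);
        }
        if let Some(args) = call["function"]["arguments"].as_str() {
            entry.arguments.push_str(args); // arguments arrive as JSON slices
        }
    }
}
```

Once the final chunk reports a `finish_reason` of `tool_calls`, each accumulated `arguments` string should parse as complete JSON and the call can be dispatched.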
Thanks @baxen for the review! I effectively reverted the subagent change, because using streaming in that context doesn't really help anything. If, in the future, we want to show partial text responses from models to subagents in a streamed way, we can think about it then, but there's no reason to fit it in now.
@jamadeo oops, I thought updating it would be an easier merge than that, but it's looking good. Sorry, I have slightly broken it (looking at it now) - feel free to yank that last merge if it is all broken and not fixed by your morning.
I think some work is still needed:
- using Databricks, when it finishes streaming it never goes back to the prompt (even if I press enter) - see the sketch below for one defensive loop shape
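A hedged guess at what could cause that hang: the stream consumer never observes a terminal condition, so the future driving the response never resolves and the readline loop never re-prompts. This sketch is illustrative, not the PR's actual code (`drain_events` and the per-chunk timeout are assumptions); it treats end-of-stream, the `[DONE]` sentinel, and a non-null `finish_reason` all as terminal:

```rust
use std::time::Duration;

use futures::{Stream, StreamExt};
use serde_json::Value;
use tokio::time::timeout;

/// Drain SSE `data:` payloads until any terminal condition is seen,
/// so control returns to the CLI prompt.
async fn drain_events<S>(mut events: S) -> Result<(), Box<dyn std::error::Error>>
where
    S: Stream<Item = String> + Unpin, // one `data:` payload per item
{
    loop {
        // Give up if the server goes quiet instead of hanging the prompt.
        let Ok(next) = timeout(Duration::from_secs(60), events.next()).await else {
            break; // timed out: assume the stream is done
        };
        let Some(data) = next else { break }; // connection closed
        if data == "[DONE]" {
            break; // explicit end-of-stream sentinel
        }
        let v: Value = serde_json::from_str(&data)?;
        if !v["choices"][0]["finish_reason"].is_null() {
            break; // model reported completion in the chunk itself
        }
    }
    Ok(()) // returning here lets the CLI loop show the prompt again
}
```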
Have been testing this with the CLI + Databricks and it seems ✅