burr
Capture TTFT with streaming
Is your feature request related to a problem? Please describe.
TTFT (time to first token) is not captured for streaming or async streaming actions.
Describe the solution you'd like
We should capture TTFT if it's a streaming action and add it to the latency view.
Describe alternatives you've considered
N/A
Additional context
This is a framework and UI change.
Options:
- Encode it in a step's log -- capture:
  - time of start
  - time of first "token" (generator first yield)
  - time of last "token"
  - number of tokens
- Encode it as an attribute
Either way, we'll need the following hooks:
- post_stream_start: after we initialize the stream
- post_stream_step(index): after every yield -- would count tokens and, after the first yield, would mark the time of the first "token"
- post_stream_end(index): after the end of the stream
Then I think we should just record this as part of the step_end_log or something, or have an optional step_profile log that we can render.
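A rough sketch of how the three hooks could feed a per-step profile, to make the proposal concrete. Everything here is hypothetical: `StreamProfile`, `profiled_stream`, the `end` field, and the `now`/`clock` arguments are illustrative names, not existing Burr APIs. It only covers the sync case and treats each yielded item as one "token".

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


@dataclass
class StreamProfile:
    """Hypothetical per-step record (not an existing Burr class)."""

    start: Optional[float] = None       # time of start
    first_item: Optional[float] = None  # time of first "token" (first yield)
    last_item: Optional[float] = None   # time of last "token"
    end: Optional[float] = None         # time the stream finished or was closed
    count: int = 0                      # number of "tokens" (yields)

    # Hook names follow the issue; the extra `now` argument is an assumption.
    def post_stream_start(self, now: float) -> None:
        self.start = now

    def post_stream_step(self, index: int, now: float) -> None:
        if index == 0:
            self.first_item = now
        self.last_item = now
        self.count = index + 1

    def post_stream_end(self, index: int, now: float) -> None:
        self.end = now

    @property
    def ttft(self) -> Optional[float]:
        """Seconds from start to first yield, or None if nothing was yielded."""
        if self.start is None or self.first_item is None:
            return None
        return self.first_item - self.start


def profiled_stream(
    stream: Iterable[T],
    profile: StreamProfile,
    clock: Callable[[], float] = time.perf_counter,
) -> Iterator[T]:
    """Wrap a stream and call the hooks around each yield.

    Because this is a generator, the start hook fires when the consumer
    first pulls from it, not when profiled_stream() is called.
    """
    profile.post_stream_start(clock())
    index = -1  # stays -1 if the stream yields nothing
    try:
        for index, item in enumerate(stream):
            profile.post_stream_step(index, clock())
            yield item
    finally:
        # Also runs if the consumer stops early and the generator is closed.
        profile.post_stream_end(index, clock())
```

After the consumer exhausts `profiled_stream(...)`, `profile.ttft` and `profile.count` could be written into whichever log we pick (step end log or a separate step profile log). An async variant would presumably look the same with `async for`.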
See #331 -- has this + a lot more