Capture TTFT with streaming

Open skrawcz opened this issue 1 year ago • 1 comments

Is your feature request related to a problem? Please describe. Time to first token (TTFT) is not captured for streaming and async streaming actions.

Describe the solution you'd like We should capture TTFT for streaming actions and add it to the latency view.

Describe alternatives you've considered N/A

Additional context This is a framework and UI change.

skrawcz avatar Aug 16 '24 23:08 skrawcz

Options:

  1. Encode it in a step's log -- capture:
    • time of start
    • time of first "token" (generator first yield)
    • time of last "token"
    • Number of tokens
  2. Encode it as an attribute

Either way, we'll need the following hooks:

  1. post_stream_start after we initialize the stream
  2. post_stream_step(index) after every yield -- would count tokens; after the first yield it would record the time of the first "token"
  3. post_stream_end(index) after the end of the stream

Then I think we should just record this as part of the step_end_log or something, or have an optional step_profile log that we can render.
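
A minimal sketch of how the three hooks could capture the timings listed in option 1. This is not Burr's actual API: `StreamTimings`, `TimingHooks`, and `instrument_stream` are hypothetical names, and it assumes `post_stream_step(index)` receives a zero-based index and `post_stream_end(index)` receives the total number of items yielded.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


@dataclass
class StreamTimings:
    """Hypothetical record of the fields listed in option 1."""

    start: Optional[float] = None
    first_token: Optional[float] = None
    last_token: Optional[float] = None
    num_tokens: int = 0

    @property
    def ttft(self) -> Optional[float]:
        # TTFT is only defined once at least one item has been yielded.
        if self.start is None or self.first_token is None:
            return None
        return self.first_token - self.start


class TimingHooks:
    """Hypothetical hook implementation; the clock is injectable for testing."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self.clock = clock
        self.timings = StreamTimings()

    def post_stream_start(self) -> None:
        self.timings.start = self.clock()

    def post_stream_step(self, index: int) -> None:
        now = self.clock()
        if index == 0:  # assumption: index is zero-based
            self.timings.first_token = now
        self.timings.last_token = now
        self.timings.num_tokens = index + 1

    def post_stream_end(self, index: int) -> None:
        # assumption: index is the total number of items yielded
        self.timings.num_tokens = index


def instrument_stream(stream: Iterable[T], hooks: TimingHooks) -> Iterator[T]:
    """Wrap a generator and fire the hooks around each yield."""
    # Runs on the first next() call, because this function is itself a generator.
    hooks.post_stream_start()
    count = 0
    try:
        for item in stream:
            hooks.post_stream_step(count)
            count += 1
            yield item
    finally:
        # Also fires if the consumer stops early or the stream raises.
        hooks.post_stream_end(count)
```

The resulting `StreamTimings` could then be written into the step's end log or a separate profile log. An async variant would follow the same shape with `async for` over an async generator.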

elijahbenizzy avatar Aug 17 '24 04:08 elijahbenizzy

See #331 -- has this + a lot more

elijahbenizzy avatar Aug 27 '24 04:08 elijahbenizzy