
How to measure latency properly?

Open marctorsoc opened this issue 3 weeks ago • 5 comments

Feature Type

Nice to have

Feature Description

I'd like to improve the docs on measuring latency, both for the project and for my own understanding.

In theory the formula is:

total_latency = eou.end_of_utterance_delay + llm.ttft + tts.ttfb
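To make the formula concrete, here is a minimal sketch in plain Python. The `TurnMetrics` container and `predicted_latency` helper are hypothetical names used only for illustration, not part of any library; the three fields mirror the terms of the formula and are assumed to be in seconds.

```python
from dataclasses import dataclass


@dataclass
class TurnMetrics:
    """Hypothetical per-turn container; fields mirror the formula's terms (seconds)."""
    end_of_utterance_delay: float  # eou.end_of_utterance_delay
    llm_ttft: float                # llm.ttft (time to first token)
    tts_ttfb: float                # tts.ttfb (time to first byte)


def predicted_latency(m: TurnMetrics) -> float:
    """Sum the three components exactly as the formula states."""
    return m.end_of_utterance_delay + m.llm_ttft + m.tts_ttfb
```

This only encodes the formula as written; it assumes the three stages run strictly one after another with nothing in between.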

but in my measurements it does not predict the amount of silence between the user finishing speaking and the agent starting to respond.

I downloaded a recording of a conversation and measured two types of turns: one where the agent calls a tool and one where it generates a response.

Notes on turns where the user's response makes the agent call a tool

This is a bit of a special case, but many of my tools have scripted messages parametrized with the tool params, i.e. a parametrized session.say. This means the first token from the LLM cannot be sent to the TTS: the LLM has to finish generating the tool call before the tool can run, and only then is the message in the say sent to the TTS. I'm happy to be wrong, but that's how I understand it 😅

I also read somewhere that there's a SentenceTokenizer, so the first token is not sent to the TTS on its own; instead, tokens are sent in chunks. I'm not sure whether the formula above reflects this, or whether it's possible to derive a closed-form expression given this mechanism.
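To illustrate why chunking could matter, here is a toy model in plain Python. It is not the library's actual SentenceTokenizer: the `first_chunk_ready` helper, the sentence-ending rule, and the token timings are all assumptions made up for illustration. The idea is that if the TTS only receives text once a full sentence has been buffered, the relevant LLM time is when the first sentence completes, not the TTFT.

```python
def first_chunk_ready(tokens, enders=(".", "!", "?")):
    """Return the arrival time (s) of the token that completes the first sentence.

    tokens: list of (arrival_time_s, text) pairs in arrival order.
    If no token ends a sentence, the buffer is assumed to flush at the last token.
    """
    if not tokens:
        raise ValueError("empty token stream")
    for arrival, text in tokens:
        if text.rstrip().endswith(enders):
            return arrival
    return tokens[-1][0]


# Illustrative timings only, not measured values.
stream = [(0.50, "Sure"), (0.58, ","), (0.66, " I"), (0.74, " can"),
          (0.82, " help"), (0.90, ".")]

ttft = stream[0][0]                  # time to first token
ready = first_chunk_ready(stream)    # time the first sentence is complete
extra_wait = ready - ttft            # delay the formula would not account for
```

Under this assumption, the middle term of the formula would be "time to first complete sentence" rather than llm.ttft, and the difference depends on how long the first sentence is and how fast tokens arrive.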

Measurements

Calling a tool

  • total measured (Audacity) = 3.3 s
  • EOU delay (Langfuse) = 1.347 s
  • LLM TTFT (Langfuse) = 2.36 s
  • TTS TTFB (Langfuse) = 0.28 s
  • total using formula = 3.98 s

Generate a response

  • total measured (Audacity) = 2.6 s
  • EOU delay (Langfuse) = 1.00 s
  • LLM TTFT (Langfuse) = 1.07 s
  • TTS TTFB (Langfuse) = 0.25 s
  • total using formula = 2.33 s
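The comparison for both turns can be reproduced with a short script. The dictionary layout and the `gap` helper are just illustrative; the numbers are the ones listed above, in seconds.

```python
# Per-turn values from the measurements above (seconds).
turns = {
    "tool call": {"measured": 3.3, "eou": 1.347, "ttft": 2.36, "ttfb": 0.28},
    "response": {"measured": 2.6, "eou": 1.00, "ttft": 1.07, "ttfb": 0.25},
}


def gap(turn):
    """Return (predicted, measured - predicted) for one turn."""
    predicted = turn["eou"] + turn["ttft"] + turn["ttfb"]
    return predicted, turn["measured"] - predicted


for name, turn in turns.items():
    predicted, diff = gap(turn)
    print(f"{name}: predicted={predicted:.3f}s "
          f"measured={turn['measured']:.1f}s diff={diff:+.3f}s")
```

Summing the listed components gives 3.987 s and 2.32 s, slightly off from the totals I wrote by hand, presumably because of rounding. More interesting is that the difference has opposite signs: the formula overestimates the tool-call turn by about 0.69 s and underestimates the response turn by about 0.28 s, so the mismatch is not a constant offset.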

Am I doing something wrong? Could you give me some hints on how to measure this effectively? I want to understand which bottlenecks keep the agent from responding more quickly. My understanding is that the time from the user finishing speaking to the agent starting to speak (the total I measured in Audacity) should be the same as the total from the formula.

Thanks in advance for any contribution / observations :)

Workarounds / Alternatives

No response

Additional Context

No response

marctorsoc, Nov 07 '25 17:11