openllmetry-js
Telemetry for time to first token when streaming
TTFT is a super key user experience metric I'd like to monitor across the various LLM providers I use. It'd be great to have a uniform way of measuring TTFT using all these instrumentations. I can see a couple ways of doing it:
- emitting a span event when the first token is received
- measuring the time until the first token and setting a new `genai.time_to_first_token` attribute (or similar) on the span.
This measurement wouldn't really apply to non-streaming use cases, but for streams that take tens of seconds, my janky version of it has proven really useful for separating TTFT from actual streaming time in my own use case. See a WIP implementation here: https://github.com/traceloop/openllmetry-js/commit/53b6bb4097fb44e081894cd69ea42fe8d8a08772
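
For discussion, here is a rough, provider-agnostic sketch of both options in one wrapper. It is not the code from the linked commit. `withTimeToFirstToken`, `SpanLike`, and the `first_token` event name are hypothetical names made up for illustration, and milliseconds is an assumed unit. `SpanLike` only declares the two methods used here, and is meant to be structurally compatible with an OpenTelemetry span's `addEvent` and `setAttribute`.

```typescript
// Hypothetical minimal span shape: only the two methods this sketch needs.
interface SpanLike {
  addEvent(name: string): unknown;
  setAttribute(key: string, value: number): unknown;
}

// Wraps any async-iterable stream of chunks and records TTFT once,
// when the first chunk arrives. The clock is injectable for testing;
// Date.now is used by default, though a monotonic clock would be
// preferable in a real instrumentation.
function withTimeToFirstToken<T>(
  stream: AsyncIterable<T>,
  span: SpanLike,
  now: () => number = Date.now,
): AsyncIterable<T> {
  // Captured eagerly, so the timer starts when the stream is wrapped
  // (ideally right when the request is sent), not when it is first read.
  const start = now();

  async function* wrapped(): AsyncGenerator<T> {
    let seenFirst = false;
    for await (const chunk of stream) {
      if (!seenFirst) {
        seenFirst = true;
        // Option 1: a span event marking the first token (event name is an assumption).
        span.addEvent("first_token");
        // Option 2: a span attribute with the elapsed time (unit assumed to be ms).
        span.setAttribute("genai.time_to_first_token", now() - start);
      }
      yield chunk;
    }
  }

  return wrapped();
}
```

An instrumentation could then return `withTimeToFirstToken(providerStream, span)` in place of the raw provider stream, so callers iterate it exactly as before. Non-streaming calls would simply skip the wrapper.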