webrtc-stats icon indicating copy to clipboard operation
webrtc-stats copied to clipboard

Request to add neteq waiting time to stats report

Open wssjwl opened this issue 6 years ago • 19 comments

Stats to be added

max_waiting_time_ms mean_waiting_time_ms

Justification

  • These are useful to understand how network was operating. Existing metrics such as plc/cng/plccng/etc do not provide enough insight that waiting time provides.
  • The metrics have already been calculated and have been collected by acm_receiver. However, it was dropped by audio_receive_stream. Hence, it is not being used for any purpose. There are only a few steps away from making it available in the stats report.

wssjwl avatar Apr 09 '19 17:04 wssjwl

What are the definition of these metrics?

henbos avatar Apr 10 '19 09:04 henbos

max_waiting_time_ms (maxWaitingTimeMs): max packet waiting time in the jitter buffer (ms) mean_waiting_time_ms (meanWaitingTimeMs): mean packet waiting time in the jitter buffer (ms)

They are defined in modules/audio_coding/include/audio_coding_module_typedefs.h

wssjwl avatar Apr 11 '19 21:04 wssjwl

The jitterBufferDelay (also exists for video) is the sum of "each sample [or frame] takes from the time it is received and to the time it exits the jitter buffer takes from the time it is received and to the time it exits the jitter buffer". You can already calculate the mean, for any desired time interval that you choose, by polling it twice and dividing the difference with the difference in jitterBufferEmittedCount.

What doesn't exist is the maximum. But do you need the maximum, given that you can poll this?

henbos avatar Apr 12 '19 10:04 henbos

In Chrome I believe this is already implemented for audio. This issue is for implementation status of the equivalent video metrics: https://bugs.chromium.org/p/webrtc/issues/detail?id=10450

henbos avatar Apr 12 '19 10:04 henbos

And max: Would this max for the duration of the entire call, or a "recent max"? "Recent" values are a bit iffy, but "for the duration of the entire call" metrics are also a bit iffy since this seem to assume you're not interested in long-running services.

henbos avatar Apr 12 '19 10:04 henbos

Yeah, it is possible to poll jitterBufferDelay. The downside is computational cost involved. Given the current implementation of RTCStatsReport, I have to poll the entire report each time just to compute one or two values. Whereas bringing those two variables into report can help to avoid all the hassle. An alternative option is to have a few light-weight polling method focusing on smaller set of values. It doesn't exist today. I am doubtful if it matches webrtc roadmap and strategy as well.

BTW, is jitterBufferDelay an average value or an instantaneous value? Does it include time consumed in computing PLC, CNG, PLCCNG, etc?

Max is very useful to understand how badly the service has experienced. If it deviates from mean much, there is indication that there may be severe voice quality degradation in addition round trip latency effect. It is desired to be a short term moving value. It gives information on user experience over time. Right now, it is implemented as max over an entire call. It can be very useful when detailed call logs are not available. One can still use a snapshot of it at end of the call to infer what had happened.

wssjwl avatar Apr 12 '19 16:04 wssjwl

I agree that you don't want to poll getStats() too often, I usually recommend once every second or every few seconds.

jitterBufferDelay is a sum of all delays of all samples, which means you can calculate the average for any interval you desire by dividing the difference between two snapshots and divide by the time difference. For example, if you poll getStats() once every second, you can calculate what the average delay was for all samples that went through the jitter buffer during that one second interval. This value should be stable enough that you can remember what the maximum value was.

I don't think the per-sample jitter buffer delay fluctuates to such extremes that you have to poll getStats() unreasonably often. Does that make sense?

I don't think jitterBufferDelay includes decoding delay, but I'm not sure.

henbos avatar Apr 12 '19 17:04 henbos

An alternative option is to have a few light-weight polling method focusing on smaller set of values. It doesn't exist today. I am doubtful if it matches webrtc roadmap and strategy as well.

You're right, it's not on the roadmap. The idea has been discussed before, but the API and codebase is is not friendly for per-metric polling. I don't see this happening, unfortunately.

If we feel the need to poll something more than roughly once per second then it probably should belong to a different API than getStats().

henbos avatar Apr 12 '19 17:04 henbos

(This is why audioLevel was added to getSynchronizationSources().)

henbos avatar Apr 12 '19 17:04 henbos

Submitter input needed: is max really needed, given that you can poll this every second?

henbos avatar Apr 17 '19 13:04 henbos

Yes. It is needed. Polling essentially gives mean only.

wssjwl avatar Apr 17 '19 16:04 wssjwl

OK thanks

henbos avatar Apr 17 '19 16:04 henbos

Thank you for your patience. Please let me know your decision and plan.

wssjwl avatar Apr 17 '19 18:04 wssjwl

what was the final proposal?

  • total wait time (with number of packet/samples will give mean, this probably exists)
  • max wait time

If yes, who is producing the PR?

vr000m avatar May 06 '19 21:05 vr000m

meanWaitingTimeMs and MaxWaitingTimeMs. The data has been collected in acm_receiver.cc.

You can find them in acm_receiver.cc. Copied their reference below.

acm_stat->meanWaitingTimeMs = neteq_stat.mean_waiting_time_ms; acm_stat->maxWaitingTimeMs = neteq_stat.max_waiting_time_ms;

wssjwl avatar May 06 '19 23:05 wssjwl

jitterBufferDelay gives the total and when divided by number of framesReceived gives the mean waiting time, probably measure in seconds. So that is done.

@wssjwl @henbos could you create a PR for jitterBufferMaxDelay (measured in seconds for consistency with other JB metrics)?

vr000m avatar May 06 '19 23:05 vr000m

I'm not convinced that max is really useful for diagnosing stuff. If you have a max, all you know is that at some point in the lifetime of the call, this max was recorded. What problem would this help you diagnose?

alvestrand avatar May 15 '19 16:05 alvestrand

Unless we hear a compelling reason to include this value, we will close the issue.

alvestrand avatar May 15 '19 16:05 alvestrand

Added the icebox label based on no activity and editors not having heard a compelling use case.

henbos avatar Aug 29 '19 16:08 henbos