roc-toolkit icon indicating copy to clipboard operation
roc-toolkit copied to clipboard

Dynamic latency adjustment

Open gavv opened this issue 1 year ago • 23 comments

Implementation

  • [x] in LinkMeter, compute jitter, total_packets, lost_packets
  • [x] teach fec::Reader and fec::Writer to compute FEC block duration
  • [x] teach LatencyMonitor to compute target latency continuously based on min latency, max latency, FEC block duration, and jitter, and pass updated target to FreqEstimator
  • [x] update pipeline to support dynamic latency mode, enabled when target latency is zero (currently some components may be not ready for this)
  • [x] make this mode default in CLI and API (in CLI, enable it via --target-latency=auto, in API via target_latency=0)
  • [x] add new metrics to API: jitter, total_packets, lost_packets (see #681)

Fixes

  • [x] RTCP crash
  • [x] seqnum crash
  • [x] incorrect calculated FEC block duration
  • [ ] rework defaults deduction (min/max/target)
  • [ ] add margin for auto-tuning min/max relative to hard min/max
  • [x] use CsvDumper
  • [ ] use LogNote

Tests

  • [x] LinkMeter tests

  • [ ] pipeline tests: test_receiver_source

    https://github.com/roc-streaming/roc-toolkit/blob/6eb0a5933c2419f5325862f845509b7150cf2fa0/src/tests/roc_pipeline/test_receiver_source.cpp#L2296-L2305

    • [x] metrics_packet_counters (simulate losses and reorders and check total_packets, lost_packets, ext_first_seqnum, ext_last_seqnum)
    • [ ] metrics_jitter (simulate timestamp jitter and check jitter metric)
  • [ ] pipeline tests: test_loopback_sink_2_source: cover new metrics (jitter, total_packets, lost_packets - smoke test that all 3 metrics are available on both sender and receiver)

    https://github.com/roc-streaming/roc-toolkit/blob/6eb0a5933c2419f5325862f845509b7150cf2fa0/src/tests/roc_pipeline/test_loopback_sink_2_source.cpp#L350-L353

    https://github.com/roc-streaming/roc-toolkit/blob/6eb0a5933c2419f5325862f845509b7150cf2fa0/src/tests/roc_pipeline/test_loopback_sink_2_source.cpp#L381-L384

  • [ ] public_api tests: cover new metrics (jitter, total_packets, lost_packets - smoke test that all 3 metrics are available on both sender and receiver)

Docs

  • [ ] update API comments
  • [ ] update CLI manual pages

gavv avatar Feb 03 '24 10:02 gavv

@gavv What is the best way you see to convert FEC block size into nanoseconds?

baranovmv avatar Feb 04 '24 18:02 baranovmv

add component that continuously computes target latency based on min latency, max latency, FEC block size, and jitter

What interface of this component do you presume would fit our needs here?

I see it as a single function class like this:

class LatencyRuller {
public:
    LatencyRuller() {}

    ns get_target_latency (ns min_latency, ns max_latency, ns fec_block_duration, ns avg_jitter) const {
        ns res = avg_jitter * 3;
        if (res < fec_block_duration) {
                res = fec_block_duration;
        }
        if (res < min_latency) {
                res = min_latency;
        }
        if (res > max_latency) {
                res = max_latency;
        }
        return res;
    }
};

Am I missing something?

  • min/max latencies and FEC block size -- API parameters, so they are constant values;
  • jitter -- is not supposed to change fast/often, its variability or speed of change is parameterized by win_len ctor argument of LinkMeter

baranovmv avatar Feb 04 '24 18:02 baranovmv

@gavv What is the best way you see to convert FEC block size into nanoseconds?

Since both fec block and packet length are variable, I think we should compute fec block duration on fly.

We can teach fec::Reader to do it. In the end of each block, we can compute its duration.

However, this duration may be varying even for fixed block size, depending on encoding and packetization. We could compute maximum ever seen duration after last block size change. Or we could compute a moving maximum of last N blocks.

Anyway, we'll need to add a pointer to fec::Reader to LatencyMonitor, and if it's non-null, LatencyMonitor should query block duration and pass it to LatencyTuner via LatencyMetrics.

Also we'll need to add a reference to IFrameDecoder to fec::Reader and use it to fill duration of repaired packets (no full decoding, just query duration). I think we'll need packet duration during block duration computation.

Since we also need to be able to tune latency on sender side, we also need to teach fec::Writer to compute FEC block duration; and teach FeedbackMonitor to write fec block duration into LatencyMetrics passed to LatencyTuner.

gavv avatar Feb 05 '24 09:02 gavv

I see it as a single function class like this:

Then I think we can inline it into LatencyTuner. I thought it would have more complicated logic.

gavv avatar Feb 05 '24 09:02 gavv

I've updated task description.

gavv avatar Feb 05 '24 09:02 gavv

add a reference to IFrameDecoder to fec::Reader and use it to fill duration of repaired packets (no full decoding, just query duration)

Am I right you imply getting difference between values of sum position() + available() on every new fec block?

Here's how we do it currently:

void Filter::populate_(const packet::PacketPtr& packet) {
    if (packet->rtp()->duration == 0) {
        packet->rtp()->duration =
            (packet::stream_timestamp_t)decoder_.decoded_sample_count(
                packet->rtp()->payload.data(), packet->rtp()->payload.size());
    }
}

gavv avatar Feb 05 '24 18:02 gavv

Wi-Fi + speedtest (no FEC, start from 100ms)

@baranovmv Testing your branch on my home wifi, so far works great!

Plotting using csvplotter.

Streaming from RPi4 to laptop over 5ghz. Running a speed test in background (in web browser) to cause increased jitter.

Full logs: 20240401_dynamic_latency_speedtest.zip

grep 'watchdog: status' roc-recv.log

15:12:48.141 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.160 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.181 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.203 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.213 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.235 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwb........
15:12:53.226 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ..............ibbbbb
15:12:53.248 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.259 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.280 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbB
15:12:53.291 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.312 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.333 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.355 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.376 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.387 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.408 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bD..................
15:13:04.298 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: .............ibbbbbb
15:13:04.299 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:13:04.320 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bD..................
15:14:43.360 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ...................i
15:14:43.381 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:14:43.402 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:14:43.424 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:14:43.445 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbD.............
15:14:55.743 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ..............ibbbbb
15:14:55.765 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbi..............

(w = warmup, b = blank, i = incomplete, D = drops of late packets)

  • There were drops at: 15:12:53, 15:13:04, 15:14:43, 15:14:55.

  • 15:12:53 is the very beginning. After seeing higher jitter than the initial latency (100ms here), latency tuner increased latency and drops quickly disappeared.

  • From ~ 15:13:20 to 15:14:50 I see in logs that jitter was gradually increasing from 155ms to 305ms, because I started the speed test. At 15:13:04 there were drops caused by it. At 15:14:43 latency tuner reacted to increased jitter and set target latency to 472ms.

  • There were a bit more drops at 15:14:43 and 15:14:55, and then they disappeared.

Start streaming

Latency tuner: increasing target latency 4800(100.000ms) → 11561(240.852ms)

Start speed test in background

Latency tuner: increasing target latency 11561(240.854ms) → 22692(472.760ms)

Stop speed test

Latency monitor: decreasing target latency 22692(472.750ms) → 18020(375.419ms)
Latency monitor: decreasing target latency 18020(375.417ms) → 14310(298.125ms)

gavv avatar Apr 01 '24 11:04 gavv

I can confirm that time of spikes in gping roughtly correspond to time of spikes in jitter in csvplotter.

gavv avatar Apr 01 '24 12:04 gavv

Ethernet (no FEC, start from 40ms)

Here's another test. RPi is streaming to laptop connected to laptop via dock station (no hubs/switches/routers).

Full logs: 20240401_ethernet.zip

Initial latency was 40ms, then it was slowly (during 10 minutes) adjusted as: 40 -> 31 -> 25 -> 53 -> 42 -> 33 ->26.

Jitter is pretty stable but has rare spikes that caused those target latency adjustments.

There were no drops at all.

I'd say that something like 25ms is indeed very suitable target latency for this link. If we'd be able to filter out jitter spikes from calculations, I guess tuner would just stick to it.

rg -N → 
roc-recv.log
18:01:04.921 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1764(40.000ms) → 1401(31.765ms)
18:03:14.147 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1401(31.769ms) → 1113(25.228ms)
18:03:29.155 [1976622] [err] roc_audio: [latency_tuner.cpp:608] Latency tuner: increasing target latency 1113(25.238ms) → 2370(53.741ms)
18:05:57.613 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 2370(53.741ms) → 1882(42.677ms)
18:06:26.648 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1882(42.676ms) → 1495(33.890ms)
18:06:54.711 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1495(33.900ms) → 1187(26.921ms)

gavv avatar Apr 01 '24 14:04 gavv

fe_stable

Here is fe_stable from previous measurement. @baranovmv is it expected?

rg -IN fe_stable  | ch 0 -1   
18:00:14.810 fe_stable=false
18:00:19.802 fe_stable=false
18:00:24.804 fe_stable=false
18:00:29.806 fe_stable=false
18:00:34.809 fe_stable=false
18:00:39.802 fe_stable=false
18:00:44.804 fe_stable=false
18:00:49.807 fe_stable=false
18:00:54.809 fe_stable=false
18:00:59.802 fe_stable=false
18:01:04.804 fe_stable=false
18:01:09.807 fe_stable=false
18:01:14.809 fe_stable=false
18:01:19.802 fe_stable=false
18:01:24.804 fe_stable=false
18:01:29.807 fe_stable=false
18:01:34.809 fe_stable=true
18:01:39.801 fe_stable=true
18:01:44.804 fe_stable=true
18:01:49.806 fe_stable=true
18:01:54.809 fe_stable=true
18:01:59.801 fe_stable=true
18:02:04.803 fe_stable=true
18:02:09.806 fe_stable=true
18:02:14.809 fe_stable=true
18:02:19.801 fe_stable=true
18:02:24.804 fe_stable=true
18:02:29.807 fe_stable=false
18:02:34.809 fe_stable=false
18:02:39.801 fe_stable=false
18:02:44.803 fe_stable=true
18:02:49.806 fe_stable=false
18:02:54.809 fe_stable=false
18:02:59.801 fe_stable=false
18:03:04.803 fe_stable=true
18:03:09.806 fe_stable=true
18:03:14.809 fe_stable=false
18:03:19.801 fe_stable=false
18:03:24.803 fe_stable=false
18:03:29.806 fe_stable=false
18:03:34.809 fe_stable=false
18:03:39.802 fe_stable=false
18:03:44.804 fe_stable=false
18:03:49.806 fe_stable=false
18:03:54.809 fe_stable=false
18:03:59.801 fe_stable=false
18:04:04.803 fe_stable=false
18:04:09.806 fe_stable=false
18:04:14.808 fe_stable=true
18:04:19.800 fe_stable=true
18:04:24.803 fe_stable=true
18:04:29.805 fe_stable=true
18:04:34.808 fe_stable=true
18:04:39.801 fe_stable=true
18:04:44.803 fe_stable=true
18:04:49.805 fe_stable=true
18:04:54.808 fe_stable=true
18:04:59.801 fe_stable=true
18:05:04.803 fe_stable=true
18:05:09.805 fe_stable=true
18:05:14.808 fe_stable=true
18:05:19.800 fe_stable=true
18:05:24.803 fe_stable=true
18:05:29.805 fe_stable=true
18:05:34.808 fe_stable=true
18:05:39.800 fe_stable=true
18:05:44.803 fe_stable=true
18:05:49.805 fe_stable=true
18:05:54.808 fe_stable=true
18:05:59.800 fe_stable=false
18:06:04.803 fe_stable=false
18:06:09.805 fe_stable=false
18:06:14.808 fe_stable=false
18:06:19.801 fe_stable=false
18:06:24.803 fe_stable=false
18:06:29.805 fe_stable=false
18:06:34.808 fe_stable=false
18:06:39.800 fe_stable=false
18:06:44.803 fe_stable=false
18:06:49.805 fe_stable=false
18:06:54.807 fe_stable=true
18:06:59.800 fe_stable=false
18:07:04.802 fe_stable=false
18:07:09.805 fe_stable=false
18:07:14.808 fe_stable=false
18:07:19.800 fe_stable=false
18:07:24.802 fe_stable=true
18:07:29.805 fe_stable=true
18:07:34.807 fe_stable=true
18:07:39.799 fe_stable=true
18:07:44.802 fe_stable=true
18:07:49.804 fe_stable=true
18:07:54.807 fe_stable=true
18:07:59.799 fe_stable=true
18:08:04.802 fe_stable=true
18:08:09.804 fe_stable=true
18:08:14.807 fe_stable=false
18:08:19.800 fe_stable=false
18:08:24.802 fe_stable=false
18:08:29.804 fe_stable=true
18:08:34.807 fe_stable=true
18:08:39.799 fe_stable=false
18:08:44.802 fe_stable=false
18:08:49.805 fe_stable=false
18:08:54.807 fe_stable=true
18:08:59.799 fe_stable=true
18:09:04.801 fe_stable=true
18:09:09.804 fe_stable=false
18:09:14.807 fe_stable=false
18:09:19.799 fe_stable=false
18:09:24.801 fe_stable=true
18:09:29.804 fe_stable=true
18:09:34.807 fe_stable=false
18:09:39.799 fe_stable=false
18:09:44.802 fe_stable=false
18:09:49.804 fe_stable=true
18:09:54.807 fe_stable=true
18:09:59.798 fe_stable=true
18:10:04.801 fe_stable=false
18:10:09.804 fe_stable=false
18:10:14.807 fe_stable=false
18:10:19.798 fe_stable=true
18:10:24.801 fe_stable=false
18:10:29.804 fe_stable=false
18:10:34.807 fe_stable=false
18:10:39.798 fe_stable=true

gavv avatar Apr 01 '24 14:04 gavv

RTCP crash

When I enable RTCP, roc-send crashes when it gots first feedback report from receiver:

./roc/roc-send -vv -s rtp+rs8m://192.168.0.101:20001 -r rs8m://192.168.0.101:20002 \
    -c rtcp://192.168.0.101:20003 -i file:./long.wav
roc-recv -vv -s rtp+rs8m://0.0.0.0:20001 -r rs8m://0.0.0.0:20002 \
    -c rtcp://0.0.0.0:20003 \
    --target-latency=200ms --min-latency=5ms --max-latency=1000ms --beep

Crash:

18:52:23.180 [2408] [dbg] roc_sndio: [backend_map.cpp:20] backend map: initializing: n_backends=2 n_drivers=58
18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=packet_pool object_size=496 min_slab=8B(1S) max_slab=0B(0S)
18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=buffer_pool object_size=992 min_slab=8B(1S) max_slab=0B(0S)
18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=buffer_pool object_size=4136 min_slab=8B(1S) max_slab=0B(0S)
18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=encoding_pool object_size=240 min_slab=3840B(16S) max_slab=0B(0S)
18:52:23.181 [2409] [dbg] roc_netio: [network_loop.cpp:278] network loop: starting event loop
18:52:23.182 [2410] [dbg] roc_ctl: [control_task_queue.cpp:95] control task queue: starting event loop
18:52:23.182 [2408] [dbg] roc_node: [context.cpp:24] context: initializing
18:52:23.182 [2408] [inf] roc_sndio: [wav_source.cpp:177] wav source: opened input file: path=./long.wav in_bits=16 in_rate=44100 in_ch=2
18:52:23.182 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=slot_pool object_size=1072 min_slab=8B(1S) max_slab=0B(0S)
18:52:23.182 [2408] [dbg] roc_node: [sender.cpp:32] sender node: initializing
18:52:23.182 [2408] [dbg] roc_node: [sender.cpp:73] sender node: configuring audiosrc interface of slot 0
18:52:23.182 [2408] [inf] roc_pipeline: [sender_sink.cpp:72] sender sink: adding slot
18:52:23.183 [2408] [dbg] roc_rtp: [identity.cpp:50] rtp identity: ssrc=1540280261 cname=853129e1-8270-4cf8-aa47-bc0d202891db
18:52:23.183 [2408] [inf] roc_node: [sender.cpp:121] sender node: connecting audiosrc interface of slot 0 to rtp+rs8m://192.168.0.101:20001
18:52:23.183 [2409] [dbg] roc_netio: [udp_port.cpp:129] udp port: : opened port
18:52:23.184 [2408] [inf] roc_node: [sender.cpp:201] sender node: bound audiosrc interface to 0.0.0.0:46121
18:52:23.184 [2409] [dbg] roc_netio: [network_loop.cpp:427] network loop: starting sending packets on port 
18:52:23.184 [2408] [dbg] roc_pipeline: [sender_slot.cpp:61] sender slot: adding audiosrc endpoint rtp+rs8m
18:52:23.184 [2408] [dbg] roc_node: [sender.cpp:73] sender node: configuring audiorpr interface of slot 0
18:52:23.184 [2408] [inf] roc_node: [sender.cpp:121] sender node: connecting audiorpr interface of slot 0 to rs8m://192.168.0.101:20002
18:52:23.185 [2408] [dbg] roc_node: [sender.cpp:493] sender node: sharing audiosrc interface port with audiorpr interface
18:52:23.185 [2408] [dbg] roc_pipeline: [sender_slot.cpp:61] sender slot: adding audiorpr endpoint rs8m
18:52:23.185 [2408] [dbg] roc_fec: [openfec_encoder.cpp:28] openfec encoder: initializing: codec=rs m=8
18:52:23.185 [2408] [dbg] roc_fec: [writer.cpp:91] fec writer: update block size: cur_sbl=0 cur_rbl=0 new_sbl=18 new_rbl=10
18:52:23.185 [2408] [dbg] roc_audio: [packetizer.cpp:60] packetizer: initializing: packet_length=5.000ms samples_per_packet=221 payload_size=884 sample_spec= chset=>
18:52:23.185 [2408] [dbg] roc_audio: [latency_tuner.cpp:199] latency tuner: initializing: target_latency=0(0.000ms) min_latency=0(0.000ms) max_latency=0(0.000ms) latency_upper_limit_coef=1.700000 stale_tolerance=0(0.000ms) scaling_interval=0(0.000ms) scaling_tolerance=0.000000 backend=niq profile=intact
18:52:23.185 [2408] [dbg] roc_node: [sender.cpp:73] sender node: configuring audioctl interface of slot 0
18:52:23.185 [2408] [inf] roc_node: [sender.cpp:121] sender node: connecting audioctl interface of slot 0 to rtcp://192.168.0.101:20003
18:52:23.186 [2408] [dbg] roc_node: [sender.cpp:493] sender node: sharing audiosrc interface port with audioctl interface
18:52:23.186 [2408] [dbg] roc_pipeline: [sender_slot.cpp:61] sender slot: adding audioctl endpoint rtcp
18:52:23.186 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=stream_pool object_size=856 min_slab=6848B(8S) max_slab=0B(0S)
18:52:23.186 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=address_pool object_size=200 min_slab=800B(4S) max_slab=0B(0S)
18:52:23.186 [2408] [dbg] roc_rtcp: [reporter.cpp:69] rtcp reporter: initializing: local_ssrc=1540280261 local_cname="853129e1-8270-4cf8-aa47-bc0d202891db" report_mode=address report_addr=192.168.0.101:20003 timeout=5000.000ms
18:52:23.186 [2408] [dbg] roc_audio: [feedback_monitor.cpp:66] feedback monitor: start gathering feedback
18:52:23.186 [2409] [dbg] roc_netio: [network_loop.cpp:447] network loop: starting receiving packets on port 
18:52:23.187 [2408] [dbg] roc_sndio: [pump.cpp:55] pump: starting main loop
18:52:23.187 [2408] [dbg] roc_packet: [router.cpp:103] router: detected new stream: source_id=1540280261 route_flags=[audio] packet_flags=[rtp,fec,audio,prepared,composed]
18:52:23.187 [2408] [dbg] roc_netio: [udp_port.cpp:579] udp port: : recv=0 send=1 send_nb=1
18:52:23.187 [2408] [dbg] roc_rtp: [timestamp_extractor.cpp:82] timestamp extractor: returning mapping: cts:1711983143187832242/sts:1946609469
18:52:23.187 [2408] [dbg] roc_rtcp: [reporter.cpp:1452] rtcp reporter: creating address: remote_addr=192.168.0.101:20003
18:52:23.188 [2408] [dbg] roc_rtcp: [reporter.cpp:1172] rtcp reporter: completed index rebuild: n_streams=0 n_addrs=1
18:52:23.188 [2408] [dbg] roc_rtcp: [communicator.cpp:855] rtcp communicator: generated_pkts=1 processed_pkts=0 proc_errs=0
18:52:23.188 [2408] [dbg] roc_pipeline: [pipeline_loop.cpp:483] pipeline loop: tasks=5 in_place=1.00 in_frame=0.00 preempts=0 sched=0/0
18:52:23.277 [2408] [dbg] roc_packet: [router.cpp:121] router: detected new stream: source_id=none route_flags=[repair] packet_flags=[fec,repair,prepared,composed]
18:52:25.197 [2408] [dbg] roc_rtcp: [reporter.cpp:1399] rtcp reporter: creating stream: ssrc=1469140577
18:52:25.197 [2408] [dbg] roc_rtcp: [reporter.cpp:310] rtcp reporter: updating stream address: ssrc=1469140577 old_addr= new_addr=192.168.0.101:20003
18:52:25.197 [2408] [inf] roc_audio: [feedback_monitor.cpp:81] feedback monitor: got first report from receiver: source=1469140577

src/internal_modules/roc_core/optional.h:58: error: roc_panic()

ERROR: roc_audio: optional: attempt to dereference unitialized object

Backtrace: #1: 0x134D25 roc::core::print_backtrace_full()+0x5C #2: 0x133905 roc::core::die_gracefully(char const*, bool)+0x50 #3: 0x12DE85 roc::core::panic(char const*, char const*, int, char const*, ...)+0xCC #4: 0xC2C13 roc::core::Optional<roc::audio::FreqEstimator, 4184u>::operator->() const+0x22 #5: 0xC1FDB roc::audio::LatencyTuner::report_()+0x19A #6: 0xC1B03 roc::audio::LatencyTuner::advance_stream(unsigned int)+0x7A #7: 0xC02A5 roc::audio::FeedbackMonitor::update_tuner_(unsigned int)+0x134 #8: 0xC0013 roc::audio::FeedbackMonitor::write(roc::audio::Frame&)+0x52 #9: 0xBF99F roc::audio::Fanout::write(roc::audio::Frame&)+0x2A #10: 0x7F003 roc::pipeline::SenderSink::write(roc::audio::Frame&)+0x4C #11: 0x7DFA9 roc::pipeline::SenderLoop::process_subframe_imp(roc::audio::Frame&)+0x18 #12: 0x7CB3F roc::pipeline::PipelineLoop::process_next_subframe_(roc::audio::Frame&, unsigned int*, unsigned int)+0xFC #13: 0x7C6B1 roc::pipeline::PipelineLoop::process_subframes_and_tasks_precise_(roc::audio::Frame&)+0x98 #14: 0x7C583 roc::pipeline::PipelineLoop::process_subframes_and_tasks(roc::audio::Frame&)+0x1A #15: 0x7DF07 roc::pipeline::SenderLoop::write(roc::audio::Frame&)+0x146 #16: 0x88FBD roc::sndio::Pump::transfer_frame_(roc::sndio::ISource&)+0x178 #17: 0x88D11 roc::sndio::Pump::run()+0x2AE #18: 0x6FDEB main+0x1FF6 #19: 0xB6AE8740 __libc_start_main+0x114 #20: 0x6DD55 _start+0x20 Aborted

gavv avatar Apr 01 '24 14:04 gavv

Wi-Fi + noisy link (FEC, standard conservative settings)

Full logs: 20240401_wifi_sane_defaults.zip

Settings:

  • Reed-Solomon FEC with default block size
  • --target-latency=200ms
  • --min-latency=50ms
  • --max-latency=1s

(I think these values can be default in roc-recv).

There were no speedtest, however the link itself was noisy since it's evening and everybody around use Wi-Fi / Bluetooth.

rg -IN → | ch 5:
Latency tuner: increasing target latency 8820(200.000ms) → 13535(306.910ms)
Latency tuner: increasing target latency 13535(306.916ms) → 27415(621.666ms)
Latency monitor: decreasing target latency 27415(621.655ms) → 21771(493.667ms)
Latency monitor: decreasing target latency 21771(493.673ms) → 17289(392.035ms)
Latency monitor: decreasing target latency 17289(392.041ms) → 13730(311.327ms)

So we started from 200ms, jumped to 306ms, then after a big jitter spike to 621ms, then gradually rolled back to 311ms (during 12 minutes).

Drops happened only once during that jitter spike. FEC partially recovered those drops.

rg -IN 'watchdog|repaired|→' | perl -ne 'print if /18:57:00/../19:03:43/'
18:57:00.531 [1980925] [err] roc_audio: [latency_tuner.cpp:608] Latency tuner: increasing target latency 8820(200.000ms) → 13535(306.910ms)
18:57:00.532 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ..............ibb...
18:57:44.659 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ...ibbbbbbbbbbbbbbbb
18:57:44.659 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.681 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.702 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.723 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.745 [1980925] [err] roc_audio: [latency_tuner.cpp:608] Latency tuner: increasing target latency 13535(306.916ms) → 27415(621.666ms)
18:57:44.745 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbD..
18:57:44.862 [1980925] [dbg] roc_fec: [openfec_decoder.cpp:370] openfec decoder: repaired 0/14/255 ....XXXXXXXXXXXXXX .........
18:57:44.862 [1980925] [dbg] roc_fec: [openfec_decoder.cpp:370] openfec decoder: repaired 0/18/255 XXXXXXXXXXXXXXXXXX .........
18:57:44.862 [1980925] [dbg] roc_fec: [openfec_decoder.cpp:370] openfec decoder: repaired 2/2/255 rr................ .........
18:57:44.862 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: .................ibb
18:57:44.883 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.905 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.926 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.947 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.968 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.980 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.990 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:45.011 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:45.032 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbi.....
19:00:14.856 [1980925] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 27415(621.655ms) → 21771(493.667ms)
19:01:09.693 [1980925] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 21771(493.673ms) → 17289(392.035ms)
19:03:43.729 [1980925] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 17289(392.041ms) → 13730(311.327ms)

gavv avatar Apr 01 '24 15:04 gavv

Wi-Fi, unstable channel

Streaming from laptop to orange pi connected to 2 different 5ghz access points, in turn connected together via cable.

I'm seeing regular repairs and drops (and crackling), but don't see any reaction from latency tuner.

Sample 1

pulse.log

Sample 2

pulse2.log

Sample 3

pulse3.log

gavv avatar Apr 02 '24 09:04 gavv

After a closer look, it seems the last one is really caused by packet loss, as discussed in chat.

So it's not a problem of latency adjustment algorithm, it behaves correctly here.

20240402_poor_link_2.zip

From wireshark:

5.61% of 314sec = 17.6 secs of lost audio

From our log:

rg -NI 'watchdog: status' | ch -1 | tr -d '\n' | grep -o 'b' | wc -l
67308

67308 blank frames * 8 samples per frame / 44100Hz = 12.21 secs of lost audio

(8 samples used by speex resampler)

Also from our log:

rg -NI cum_loss | ch 13 | tail -1
cum_loss=2301

Wireshark report and our report are close in number and duration of lost packets.

BTW reported max jitter is quite different: 100ms vs 600ms.

gavv avatar Apr 02 '24 18:04 gavv

The problem can be solved by using larger FEC blocks on sender, e.g. --nbsrc=30 --nbrpr=20.

gavv avatar Apr 02 '24 19:04 gavv

If I use --packet-len 7ms --nbsrc 40 --nbrpr 20 on sender, which gives FEC blocks of 280ms, receiver still uses target latency of 200ms for some reason.

(Jitter is below 200ms in that case).

gavv avatar Apr 03 '24 10:04 gavv

Wi-Fi, unstable channel

I did some more experiments in this setup (streaming from laptop to orange pi connected to 2 different 5ghz access points, in turn connected together via cable).

Some observations:

  • All problems are caused by burst losses.

  • Here is the best configuration that I've found for this particular setup:

    • on sender: increase packet_len to 7ms
    • on sender: increase FEC block size to 60/20 (420ms)
    • on sender: switch from reed-solomon to LDPC-staircase
    • on receiver: increase min_latency to 500ms
  • Why:

    • when packets are smaller, there are more losses
    • when FEC block is smaller, burst losses cause loss of entire block(s)
    • when FEC block is larger, and there are losses, reed-solomon is too slow and causes glitches on receiver (device spec)
    • FEC works fine only when target latency is larger than FEC block, in this case higher by ~100ms works well

All this is expected, just want to document an actual experiment.

BTW even after doing all this I'm still not satisfied with that setup - sometimes burst losses are too long to be covered by a sane FEC block size. I wonder if RTX would help here (I don't know if in this periods ALL packets are dropped or just MANY of them. Also I don't know yet exact reason of losses - physical link or router queues).

Also I think compression would help here.

gavv avatar Apr 03 '24 11:04 gavv

During some short period, I was constantly getting this panic:

link meter: seqnum was not processed in writer part

I never restarted sender. I restarted receiver multiple times and got panic each time.

After a short time, panic disappears. And it starts reproducing again for a short time after a minute or two. And so on.

I suppose it's related to seqnum overflows.

First two files in zip are just logs. Second to files also has RTP headers, I added packet->print(0) to LinkMeter::read().

20240403_seqnum_crash.zip

gavv avatar Apr 03 '24 17:04 gavv

Here is an example when tuner was too optimistic, reduced latency, and it caused drops of late packets and crackling.

It happened after a long measurement (about 30 minutes). Attached logs have only the final part. See "dropping late packet" in the end of pulse.log.

20240403_too_optimistic.zip

gavv avatar Apr 03 '24 19:04 gavv

@gavv I'm about to enable roc-recv to determine if latency is fixed or it is dynamic.

We've got --target-latency command line parameter.

  • As you proposed, one possible option is to allow it to comprehend auto variant, but it wouldn't be possible to specify a starting latency, which would be default in this case. I'd like to keep it though
  • Another option is to add a bool parameter --dynamic-latency which should be set to true by default

baranovmv avatar May 06 '24 15:05 baranovmv

If we want to support specifying starting latency in CLI, then we also need it in C API?

There are two approaches for C API:

  • target_latency = 0 means adaptive latency, and we add a new field start_latency
  • we add a new flag / mode that enables adaptive latency; in this mode target_latency is used as starting latency

I think the second approach is a bit confusing, because target_latency suddenly transforms to starting latency in some cases.

If you agree, then in C API we can have target_latency and start_latency. In this case, I think in CLI we should have the same: --target-latency and --start-latency (because CLI mirrors C API).

If in future version we'll deduce starting latency automatically, this layout will work fine too. In this case start_latency = 0 would mean "deduce starting latency", and start_latency != 0 would mean "use fixed starting latency".

gavv avatar May 06 '24 16:05 gavv

tests: cover new metrics (jitter, total_packets, lost_packets - smoke test that all 3 metrics are available on both sender and receiver)

We need #674 to see these metrics on sender

baranovmv avatar Jun 30 '24 14:06 baranovmv

Pipeline tests test_loopback_sink_2_source fail with FlagCTS: timestamp_mapping and timestamp_mapping_remixing

https://github.com/roc-streaming/roc-toolkit/blob/c327926dc0af72c78e15135ae6ac53af75f71fa6/src/tests/roc_pipeline/test_loopback_sink_2_source.cpp#L399-L403

baranovmv avatar Jul 02 '24 21:07 baranovmv

Pipeline tests test_loopback_sink_2_source fail with FlagCTS: timestamp_mapping and timestamp_mapping_remixing

It's actually not related to CTS.

Currently you have:

if (flags & FlagRTCP) {
    ...
    if ((flags & FlagCTS) == 0) {
        <check total_packets>

If you look at the tests which set FlagCTS, you'll see:

TEST(loopback_sink_2_source, timestamp_mapping) {
    enum { Chans = Chans_Stereo, NumSess = 1 };

    send_receive(FlagRTCP | FlagCTS, NumSess, Chans, Chans);
}

TEST(loopback_sink_2_source, timestamp_mapping_remixing) {
    enum { FrameChans = Chans_Mono, PacketChans = Chans_Stereo, NumSess = 1 };

    send_receive(FlagRTCP | FlagCTS, NumSess, FrameChans, PacketChans);
}

FlagRTCP and FlagCTS are always set together, so your if effectively disables total_packets check in all tests, i.e. there are no tests where this check actually works.

The difference of 1 packet is actually expected. SenderParticipantMetrics.link.total_packets is not number of packets generated by sender. It is number of packets expected by receiver, as reported via RTCP feedback from receiver to sender.

(The whole SenderParticipantMetrics is based on RTCP feedback. It's more obvious for other fields, but is surprising for total_packets field. However, if we would set total_packet differently, it would cause inconsistency with other metrics).

When we call check_metrics(), the latest RTCP report from receiver to sender in not processed by sender yet, hence the difference.

I think in this test, since it is very high-level, we can consider it's an implementation detail (when exactly RTCP is processed) and allow 1 packet delta for the counters.

I suggest to replace this check:

UNSIGNED_LONGS_EQUAL(recv_party_metrics.link.lost_packets,
                     send_party_metrics.link.lost_packets);

with:

CHECK(send_party_metrics.link.total_packets) >= proxy.n_source() - 1 &&
      send_party_metrics.link.total_packets) <= proxy.n_source());
UNSIGNED_LONGS_EQUAL(proxy.n_source(),
                     recv_party_metrics.link.total_packets);

(It's better to check against PacketProxy::n_source())

gavv avatar Jul 07 '24 11:07 gavv

We need https://github.com/roc-streaming/roc-toolkit/issues/674 to see these metrics on sender

As I said above, total_packets and other link metrics that you see on sender are from RTCP feedback.

If you run test with -v -t flags:

roc-test-pipeline -v -t -sn timestamp_mapping

you'll see actual RTCP packets being delivered:

@ packet [0x5631436235f8] [udp,rtcp,control,prepared,composed]
 udp: src=<none> dst=127.0.0.1:44 rts=0
 rtcp: size=176
@ rtcp packet (176 bytes)
+ rr:
|- header:
|-- version: 2
|-- padding: 0
|-- counter: 1
|-- type: 201
|-- length: 32 bytes (7 words)
|- body:
|-- ssrc: 3996983558
|- block:
|-- ssrc: 3010344341
|-- fract_loss: 0.000000
|-- cum_loss: 0
|-- last_seqnum: 30389
|-- jitter: 36
|-- lsr: 0000c0c02e350000 (unix 2086027840180496215)
|-- dlsr: 00000000000f0000 (unix 228881)
+ xr:
|- header:
|-- version: 2
|-- padding: 0
|-- counter: 0
|-- type: 207
|-- length: 96 bytes (23 words)
|- body:
|-- ssrc: 3996983558
|- rrtr:
|-- block header:
|--- type: 4
|--- type_specific: 0x0
|--- length: 12 bytes (2 words)
|-- block body:
|--- ntp_timestamp: 83b9c0c029905011 (unix 1000000162358287)
|- measurement:
|-- block header:
|--- type: 14
|--- type_specific: 0x0
|--- length: 32 bytes (7 words)
|-- block body:
|--- ssrc: 3010344341
|--- first_sn: 30191
|--- interval_first_sn: 30389
|--- interval_last_sn: 30389
|--- interval_duration: 0000000000000000 (unix 0)
|--- cum_duration: 0000000000000000 (unix 0)
|- delay:
|-- block header:
|--- type: 16
|--- type_specific: 0x40
|--- length: 28 bytes (6 words)
|-- block body:
|--- flag: sample (1)
|--- ssrc: 3010344341
|--- rtt_mean: 00000000000f0000 (unix 228881)
|--- rtt_min: 00000000000f0000 (unix 228881)
|--- rtt_max: 00000000000f0000 (unix 228881)
|--- e2e_latency: 0000000019999999 (unix 99999999)
|- queue:
|-- block header:
|--- type: 220
|--- type_specific: 0x40
|--- length: 16 bytes (3 words)
|-- block body:
|--- flag: sample (1)
|--- ssrc: 3010344341
|--- niq_latency: 0000000004780000 (unix 17456054)
|--- niq_stalling: 0000000000000000 (unix 0)
+ sdes:
|- chunk:
|-- ssrc: 3996983558
|-- item:
|--- type: cname (1)
|--- text: 0425e8c6-21a6-47d3-807f-bba0a9b60742

Here you can see that for example jitter is non-zero.

I think in test_loopback_sink_2_source it's enough to check that:

  • jitter is non-zero
  • if FlagLosses is enabled, lost_packets becomes non-zero after a while
  • jitter, lost_packets, total_packets have close values on sender and receiver (but not strictly equal)

Since we plan to have good coverage for these metrics in test_receiver_source and test_sender_sink, in test_loopback_sink_2_source it's enough to do only a very basic check.

gavv avatar Jul 07 '24 11:07 gavv

    /** Total amount of packets sent or expected to be received.
     *
     *  On sender, this counter is just incremented every packet.
     *  On receiver, it is derived from seqnums.
     */

So this comment is wrong.

Maybe expected_packets would be a better name for this metrics?

With a comment like:

Total amount of packets that receiver expects to be delivered.

gavv avatar Jul 07 '24 11:07 gavv

QA: https://github.com/roc-streaming/qa/tree/main/manual/20241011_gh688_adaptive_latency

gavv avatar Aug 11 '24 19:08 gavv