roc-toolkit
roc-toolkit copied to clipboard
Dynamic latency adjustment
Implementation
- [x] in LinkMeter, compute jitter, total_packets, lost_packets
- [x] teach fec::Reader and fec::Writer to compute FEC block duration
- [x] teach LatencyMonitor to compute target latency continuously based on min latency, max latency, FEC block duration, and jitter, and pass updated target to FreqEstimator
- [x] update pipeline to support dynamic latency mode, enabled when target latency is zero (currently some components may be not ready for this)
- [x] make this mode default in CLI and API (in CLI, enable it via --target-latency=auto, in API via target_latency=0)
- [x] add new metrics to API: jitter, total_packets, lost_packets (see #681)
Fixes
- [x] RTCP crash
- [x] seqnum crash
- [x] incorrect calculated FEC block duration
- [ ] rework defaults deduction (min/max/target)
- [ ] add margin for auto-tuning min/max relative to hard min/max
- [x] use CsvDumper
- [ ] use LogNote
Tests
-
[x] LinkMeter tests
-
[ ] pipeline tests:
test_receiver_source
https://github.com/roc-streaming/roc-toolkit/blob/6eb0a5933c2419f5325862f845509b7150cf2fa0/src/tests/roc_pipeline/test_receiver_source.cpp#L2296-L2305
- [x]
metrics_packet_counters
(simulate losses and reorders and check total_packets, lost_packets, ext_first_seqnum, ext_last_seqnum) - [ ]
metrics_jitter
(simulate timestamp jitter and check jitter metric)
- [x]
-
[ ] pipeline tests:
test_loopback_sink_2_source
: cover new metrics (jitter, total_packets, lost_packets - smoke test that all 3 metrics are available on both sender and receiver)https://github.com/roc-streaming/roc-toolkit/blob/6eb0a5933c2419f5325862f845509b7150cf2fa0/src/tests/roc_pipeline/test_loopback_sink_2_source.cpp#L350-L353
https://github.com/roc-streaming/roc-toolkit/blob/6eb0a5933c2419f5325862f845509b7150cf2fa0/src/tests/roc_pipeline/test_loopback_sink_2_source.cpp#L381-L384
-
[ ]
public_api
tests: cover new metrics (jitter, total_packets, lost_packets - smoke test that all 3 metrics are available on both sender and receiver)
Docs
- [ ] update API comments
- [ ] update CLI manual pages
@gavv What is the best way you see to convert FEC block size
into nanoseconds?
add component that continuously computes target latency based on min latency, max latency, FEC block size, and jitter
What interface of this component do you presume would fit our needs here?
I see it as a single function class like this:
class LatencyRuller {
public:
LatencyRuller() {}
ns get_target_latency (ns min_latency, ns max_latency, ns fec_block_duration, ns avg_jitter) const {
ns res = avg_jitter * 3;
if (res < fec_block_duration) {
res = fec_block_duration;
}
if (res < min_latency) {
res = min_latency;
}
if (res > max_latency) {
res = max_latency;
}
return res;
}
};
Am I missing something?
-
min/max latencies
andFEC block size
-- API parameters, so they are constant values; -
jitter
-- is not supposed to change fast/often, its variability or speed of change is parameterized bywin_len
ctor argument ofLinkMeter
@gavv What is the best way you see to convert FEC block size into nanoseconds?
Since both fec block and packet length are variable, I think we should compute fec block duration on fly.
We can teach fec::Reader to do it. In the end of each block, we can compute its duration.
However, this duration may be varying even for fixed block size, depending on encoding and packetization. We could compute maximum ever seen duration after last block size change. Or we could compute a moving maximum of last N blocks.
Anyway, we'll need to add a pointer to fec::Reader to LatencyMonitor, and if it's non-null, LatencyMonitor should query block duration and pass it to LatencyTuner via LatencyMetrics.
Also we'll need to add a reference to IFrameDecoder to fec::Reader and use it to fill duration of repaired packets (no full decoding, just query duration). I think we'll need packet duration during block duration computation.
Since we also need to be able to tune latency on sender side, we also need to teach fec::Writer to compute FEC block duration; and teach FeedbackMonitor to write fec block duration into LatencyMetrics passed to LatencyTuner.
I see it as a single function class like this:
Then I think we can inline it into LatencyTuner. I thought it would have more complicated logic.
I've updated task description.
add a reference to IFrameDecoder to fec::Reader and use it to fill duration of repaired packets (no full decoding, just query duration)
Am I right you imply getting difference between values of sum
position() + available()
on every new fec block?
Here's how we do it currently:
void Filter::populate_(const packet::PacketPtr& packet) {
if (packet->rtp()->duration == 0) {
packet->rtp()->duration =
(packet::stream_timestamp_t)decoder_.decoded_sample_count(
packet->rtp()->payload.data(), packet->rtp()->payload.size());
}
}
Wi-Fi + speedtest (no FEC, start from 100ms)
@baranovmv Testing your branch on my home wifi, so far works great!
Plotting using csvplotter
.
Streaming from RPi4 to laptop over 5ghz. Running a speed test in background (in web browser) to cause increased jitter.
Full logs: 20240401_dynamic_latency_speedtest.zip
grep 'watchdog: status' roc-recv.log
15:12:48.141 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.160 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.181 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.203 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.213 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwwwwwwwwww
15:12:48.235 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: wwwwwwwwwwwb........
15:12:53.226 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ..............ibbbbb
15:12:53.248 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.259 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.280 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbB
15:12:53.291 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.312 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.333 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.355 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.376 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.387 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:12:53.408 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bD..................
15:13:04.298 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: .............ibbbbbb
15:13:04.299 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:13:04.320 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bD..................
15:14:43.360 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ...................i
15:14:43.381 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:14:43.402 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:14:43.424 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
15:14:43.445 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbD.............
15:14:55.743 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ..............ibbbbb
15:14:55.765 [1965003] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbi..............
(w
= warmup, b
= blank, i
= incomplete, D
= drops of late packets)
-
There were drops at:
15:12:53
,15:13:04
,15:14:43
,15:14:55
. -
15:12:53
is the very beginning. After seeing higher jitter than the initial latency (100ms here), latency tuner increased latency and drops quickly disappeared. -
From ~
15:13:20
to15:14:50
I see in logs that jitter was gradually increasing from 155ms to 305ms, because I started the speed test. At15:13:04
there were drops caused by it. At15:14:43
latency tuner reacted to increased jitter and set target latency to 472ms. -
There were a bit more drops at
15:14:43
and15:14:55
, and then they disappeared.
Start streaming
Latency tuner: increasing target latency 4800(100.000ms) → 11561(240.852ms)
Start speed test in background
Latency tuner: increasing target latency 11561(240.854ms) → 22692(472.760ms)
Stop speed test
Latency monitor: decreasing target latency 22692(472.750ms) → 18020(375.419ms)
Latency monitor: decreasing target latency 18020(375.417ms) → 14310(298.125ms)
I can confirm that time of spikes in gping
roughtly correspond to time of spikes in jitter in csvplotter.
Ethernet (no FEC, start from 40ms)
Here's another test. RPi is streaming to laptop connected to laptop via dock station (no hubs/switches/routers).
Full logs: 20240401_ethernet.zip
Initial latency was 40ms, then it was slowly (during 10 minutes) adjusted as: 40 -> 31 -> 25 -> 53 -> 42 -> 33 ->26.
Jitter is pretty stable but has rare spikes that caused those target latency adjustments.
There were no drops at all.
I'd say that something like 25ms is indeed very suitable target latency for this link. If we'd be able to filter out jitter spikes from calculations, I guess tuner would just stick to it.
rg -N →
roc-recv.log
18:01:04.921 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1764(40.000ms) → 1401(31.765ms)
18:03:14.147 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1401(31.769ms) → 1113(25.228ms)
18:03:29.155 [1976622] [err] roc_audio: [latency_tuner.cpp:608] Latency tuner: increasing target latency 1113(25.238ms) → 2370(53.741ms)
18:05:57.613 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 2370(53.741ms) → 1882(42.677ms)
18:06:26.648 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1882(42.676ms) → 1495(33.890ms)
18:06:54.711 [1976622] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 1495(33.900ms) → 1187(26.921ms)
fe_stable
Here is fe_stable
from previous measurement. @baranovmv is it expected?
rg -IN fe_stable | ch 0 -1 18:00:14.810 fe_stable=false 18:00:19.802 fe_stable=false 18:00:24.804 fe_stable=false 18:00:29.806 fe_stable=false 18:00:34.809 fe_stable=false 18:00:39.802 fe_stable=false 18:00:44.804 fe_stable=false 18:00:49.807 fe_stable=false 18:00:54.809 fe_stable=false 18:00:59.802 fe_stable=false 18:01:04.804 fe_stable=false 18:01:09.807 fe_stable=false 18:01:14.809 fe_stable=false 18:01:19.802 fe_stable=false 18:01:24.804 fe_stable=false 18:01:29.807 fe_stable=false 18:01:34.809 fe_stable=true 18:01:39.801 fe_stable=true 18:01:44.804 fe_stable=true 18:01:49.806 fe_stable=true 18:01:54.809 fe_stable=true 18:01:59.801 fe_stable=true 18:02:04.803 fe_stable=true 18:02:09.806 fe_stable=true 18:02:14.809 fe_stable=true 18:02:19.801 fe_stable=true 18:02:24.804 fe_stable=true 18:02:29.807 fe_stable=false 18:02:34.809 fe_stable=false 18:02:39.801 fe_stable=false 18:02:44.803 fe_stable=true 18:02:49.806 fe_stable=false 18:02:54.809 fe_stable=false 18:02:59.801 fe_stable=false 18:03:04.803 fe_stable=true 18:03:09.806 fe_stable=true 18:03:14.809 fe_stable=false 18:03:19.801 fe_stable=false 18:03:24.803 fe_stable=false 18:03:29.806 fe_stable=false 18:03:34.809 fe_stable=false 18:03:39.802 fe_stable=false 18:03:44.804 fe_stable=false 18:03:49.806 fe_stable=false 18:03:54.809 fe_stable=false 18:03:59.801 fe_stable=false 18:04:04.803 fe_stable=false 18:04:09.806 fe_stable=false 18:04:14.808 fe_stable=true 18:04:19.800 fe_stable=true 18:04:24.803 fe_stable=true 18:04:29.805 fe_stable=true 18:04:34.808 fe_stable=true 18:04:39.801 fe_stable=true 18:04:44.803 fe_stable=true 18:04:49.805 fe_stable=true 18:04:54.808 fe_stable=true 18:04:59.801 fe_stable=true 18:05:04.803 fe_stable=true 18:05:09.805 fe_stable=true 18:05:14.808 fe_stable=true 18:05:19.800 fe_stable=true 18:05:24.803 fe_stable=true 18:05:29.805 fe_stable=true 18:05:34.808 fe_stable=true 18:05:39.800 fe_stable=true 18:05:44.803 fe_stable=true 18:05:49.805 fe_stable=true 18:05:54.808 fe_stable=true 18:05:59.800 fe_stable=false 18:06:04.803 fe_stable=false 18:06:09.805 fe_stable=false 18:06:14.808 fe_stable=false 18:06:19.801 fe_stable=false 18:06:24.803 fe_stable=false 18:06:29.805 fe_stable=false 18:06:34.808 fe_stable=false 18:06:39.800 fe_stable=false 18:06:44.803 fe_stable=false 18:06:49.805 fe_stable=false 18:06:54.807 fe_stable=true 18:06:59.800 fe_stable=false 18:07:04.802 fe_stable=false 18:07:09.805 fe_stable=false 18:07:14.808 fe_stable=false 18:07:19.800 fe_stable=false 18:07:24.802 fe_stable=true 18:07:29.805 fe_stable=true 18:07:34.807 fe_stable=true 18:07:39.799 fe_stable=true 18:07:44.802 fe_stable=true 18:07:49.804 fe_stable=true 18:07:54.807 fe_stable=true 18:07:59.799 fe_stable=true 18:08:04.802 fe_stable=true 18:08:09.804 fe_stable=true 18:08:14.807 fe_stable=false 18:08:19.800 fe_stable=false 18:08:24.802 fe_stable=false 18:08:29.804 fe_stable=true 18:08:34.807 fe_stable=true 18:08:39.799 fe_stable=false 18:08:44.802 fe_stable=false 18:08:49.805 fe_stable=false 18:08:54.807 fe_stable=true 18:08:59.799 fe_stable=true 18:09:04.801 fe_stable=true 18:09:09.804 fe_stable=false 18:09:14.807 fe_stable=false 18:09:19.799 fe_stable=false 18:09:24.801 fe_stable=true 18:09:29.804 fe_stable=true 18:09:34.807 fe_stable=false 18:09:39.799 fe_stable=false 18:09:44.802 fe_stable=false 18:09:49.804 fe_stable=true 18:09:54.807 fe_stable=true 18:09:59.798 fe_stable=true 18:10:04.801 fe_stable=false 18:10:09.804 fe_stable=false 18:10:14.807 fe_stable=false 18:10:19.798 fe_stable=true 18:10:24.801 fe_stable=false 18:10:29.804 fe_stable=false 18:10:34.807 fe_stable=false 18:10:39.798 fe_stable=true
RTCP crash
When I enable RTCP, roc-send crashes when it gots first feedback report from receiver:
./roc/roc-send -vv -s rtp+rs8m://192.168.0.101:20001 -r rs8m://192.168.0.101:20002 \
-c rtcp://192.168.0.101:20003 -i file:./long.wav
roc-recv -vv -s rtp+rs8m://0.0.0.0:20001 -r rs8m://0.0.0.0:20002 \
-c rtcp://0.0.0.0:20003 \
--target-latency=200ms --min-latency=5ms --max-latency=1000ms --beep
Crash:
18:52:23.180 [2408] [dbg] roc_sndio: [backend_map.cpp:20] backend map: initializing: n_backends=2 n_drivers=58 18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=packet_pool object_size=496 min_slab=8B(1S) max_slab=0B(0S) 18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=buffer_pool object_size=992 min_slab=8B(1S) max_slab=0B(0S) 18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=buffer_pool object_size=4136 min_slab=8B(1S) max_slab=0B(0S) 18:52:23.181 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=encoding_pool object_size=240 min_slab=3840B(16S) max_slab=0B(0S) 18:52:23.181 [2409] [dbg] roc_netio: [network_loop.cpp:278] network loop: starting event loop 18:52:23.182 [2410] [dbg] roc_ctl: [control_task_queue.cpp:95] control task queue: starting event loop 18:52:23.182 [2408] [dbg] roc_node: [context.cpp:24] context: initializing 18:52:23.182 [2408] [inf] roc_sndio: [wav_source.cpp:177] wav source: opened input file: path=./long.wav in_bits=16 in_rate=44100 in_ch=2 18:52:23.182 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=slot_pool object_size=1072 min_slab=8B(1S) max_slab=0B(0S) 18:52:23.182 [2408] [dbg] roc_node: [sender.cpp:32] sender node: initializing 18:52:23.182 [2408] [dbg] roc_node: [sender.cpp:73] sender node: configuring audiosrc interface of slot 0 18:52:23.182 [2408] [inf] roc_pipeline: [sender_sink.cpp:72] sender sink: adding slot 18:52:23.183 [2408] [dbg] roc_rtp: [identity.cpp:50] rtp identity: ssrc=1540280261 cname=853129e1-8270-4cf8-aa47-bc0d202891db 18:52:23.183 [2408] [inf] roc_node: [sender.cpp:121] sender node: connecting audiosrc interface of slot 0 to rtp+rs8m://192.168.0.101:20001 18:52:23.183 [2409] [dbg] roc_netio: [udp_port.cpp:129] udp port:: opened port 18:52:23.184 [2408] [inf] roc_node: [sender.cpp:201] sender node: bound audiosrc interface to 0.0.0.0:46121 18:52:23.184 [2409] [dbg] roc_netio: [network_loop.cpp:427] network loop: starting sending packets on port 18:52:23.184 [2408] [dbg] roc_pipeline: [sender_slot.cpp:61] sender slot: adding audiosrc endpoint rtp+rs8m 18:52:23.184 [2408] [dbg] roc_node: [sender.cpp:73] sender node: configuring audiorpr interface of slot 0 18:52:23.184 [2408] [inf] roc_node: [sender.cpp:121] sender node: connecting audiorpr interface of slot 0 to rs8m://192.168.0.101:20002 18:52:23.185 [2408] [dbg] roc_node: [sender.cpp:493] sender node: sharing audiosrc interface port with audiorpr interface 18:52:23.185 [2408] [dbg] roc_pipeline: [sender_slot.cpp:61] sender slot: adding audiorpr endpoint rs8m 18:52:23.185 [2408] [dbg] roc_fec: [openfec_encoder.cpp:28] openfec encoder: initializing: codec=rs m=8 18:52:23.185 [2408] [dbg] roc_fec: [writer.cpp:91] fec writer: update block size: cur_sbl=0 cur_rbl=0 new_sbl=18 new_rbl=10 18:52:23.185 [2408] [dbg] roc_audio: [packetizer.cpp:60] packetizer: initializing: packet_length=5.000ms samples_per_packet=221 payload_size=884 sample_spec= chset= > 18:52:23.185 [2408] [dbg] roc_audio: [latency_tuner.cpp:199] latency tuner: initializing: target_latency=0(0.000ms) min_latency=0(0.000ms) max_latency=0(0.000ms) latency_upper_limit_coef=1.700000 stale_tolerance=0(0.000ms) scaling_interval=0(0.000ms) scaling_tolerance=0.000000 backend=niq profile=intact 18:52:23.185 [2408] [dbg] roc_node: [sender.cpp:73] sender node: configuring audioctl interface of slot 0 18:52:23.185 [2408] [inf] roc_node: [sender.cpp:121] sender node: connecting audioctl interface of slot 0 to rtcp://192.168.0.101:20003 18:52:23.186 [2408] [dbg] roc_node: [sender.cpp:493] sender node: sharing audiosrc interface port with audioctl interface 18:52:23.186 [2408] [dbg] roc_pipeline: [sender_slot.cpp:61] sender slot: adding audioctl endpoint rtcp 18:52:23.186 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=stream_pool object_size=856 min_slab=6848B(8S) max_slab=0B(0S) 18:52:23.186 [2408] [dbg] roc_core: [slab_pool_impl.cpp:61] pool: initializing: name=address_pool object_size=200 min_slab=800B(4S) max_slab=0B(0S) 18:52:23.186 [2408] [dbg] roc_rtcp: [reporter.cpp:69] rtcp reporter: initializing: local_ssrc=1540280261 local_cname="853129e1-8270-4cf8-aa47-bc0d202891db" report_mode=address report_addr=192.168.0.101:20003 timeout=5000.000ms 18:52:23.186 [2408] [dbg] roc_audio: [feedback_monitor.cpp:66] feedback monitor: start gathering feedback 18:52:23.186 [2409] [dbg] roc_netio: [network_loop.cpp:447] network loop: starting receiving packets on port 18:52:23.187 [2408] [dbg] roc_sndio: [pump.cpp:55] pump: starting main loop 18:52:23.187 [2408] [dbg] roc_packet: [router.cpp:103] router: detected new stream: source_id=1540280261 route_flags=[audio] packet_flags=[rtp,fec,audio,prepared,composed] 18:52:23.187 [2408] [dbg] roc_netio: [udp_port.cpp:579] udp port: : recv=0 send=1 send_nb=1 18:52:23.187 [2408] [dbg] roc_rtp: [timestamp_extractor.cpp:82] timestamp extractor: returning mapping: cts:1711983143187832242/sts:1946609469 18:52:23.187 [2408] [dbg] roc_rtcp: [reporter.cpp:1452] rtcp reporter: creating address: remote_addr=192.168.0.101:20003 18:52:23.188 [2408] [dbg] roc_rtcp: [reporter.cpp:1172] rtcp reporter: completed index rebuild: n_streams=0 n_addrs=1 18:52:23.188 [2408] [dbg] roc_rtcp: [communicator.cpp:855] rtcp communicator: generated_pkts=1 processed_pkts=0 proc_errs=0 18:52:23.188 [2408] [dbg] roc_pipeline: [pipeline_loop.cpp:483] pipeline loop: tasks=5 in_place=1.00 in_frame=0.00 preempts=0 sched=0/0 18:52:23.277 [2408] [dbg] roc_packet: [router.cpp:121] router: detected new stream: source_id=none route_flags=[repair] packet_flags=[fec,repair,prepared,composed] 18:52:25.197 [2408] [dbg] roc_rtcp: [reporter.cpp:1399] rtcp reporter: creating stream: ssrc=1469140577 18:52:25.197 [2408] [dbg] roc_rtcp: [reporter.cpp:310] rtcp reporter: updating stream address: ssrc=1469140577 old_addr= new_addr=192.168.0.101:20003 18:52:25.197 [2408] [inf] roc_audio: [feedback_monitor.cpp:81] feedback monitor: got first report from receiver: source=1469140577 src/internal_modules/roc_core/optional.h:58: error: roc_panic()
ERROR: roc_audio: optional: attempt to dereference unitialized object
Backtrace: #1: 0x134D25 roc::core::print_backtrace_full()+0x5C #2: 0x133905 roc::core::die_gracefully(char const*, bool)+0x50 #3: 0x12DE85 roc::core::panic(char const*, char const*, int, char const*, ...)+0xCC #4: 0xC2C13 roc::core::Optional<roc::audio::FreqEstimator, 4184u>::operator->() const+0x22 #5: 0xC1FDB roc::audio::LatencyTuner::report_()+0x19A #6: 0xC1B03 roc::audio::LatencyTuner::advance_stream(unsigned int)+0x7A #7: 0xC02A5 roc::audio::FeedbackMonitor::update_tuner_(unsigned int)+0x134 #8: 0xC0013 roc::audio::FeedbackMonitor::write(roc::audio::Frame&)+0x52 #9: 0xBF99F roc::audio::Fanout::write(roc::audio::Frame&)+0x2A #10: 0x7F003 roc::pipeline::SenderSink::write(roc::audio::Frame&)+0x4C #11: 0x7DFA9 roc::pipeline::SenderLoop::process_subframe_imp(roc::audio::Frame&)+0x18 #12: 0x7CB3F roc::pipeline::PipelineLoop::process_next_subframe_(roc::audio::Frame&, unsigned int*, unsigned int)+0xFC #13: 0x7C6B1 roc::pipeline::PipelineLoop::process_subframes_and_tasks_precise_(roc::audio::Frame&)+0x98 #14: 0x7C583 roc::pipeline::PipelineLoop::process_subframes_and_tasks(roc::audio::Frame&)+0x1A #15: 0x7DF07 roc::pipeline::SenderLoop::write(roc::audio::Frame&)+0x146 #16: 0x88FBD roc::sndio::Pump::transfer_frame_(roc::sndio::ISource&)+0x178 #17: 0x88D11 roc::sndio::Pump::run()+0x2AE #18: 0x6FDEB main+0x1FF6 #19: 0xB6AE8740 __libc_start_main+0x114 #20: 0x6DD55 _start+0x20 Aborted
Wi-Fi + noisy link (FEC, standard conservative settings)
Full logs: 20240401_wifi_sane_defaults.zip
Settings:
- Reed-Solomon FEC with default block size
- --target-latency=200ms
- --min-latency=50ms
- --max-latency=1s
(I think these values can be default in roc-recv).
There were no speedtest, however the link itself was noisy since it's evening and everybody around use Wi-Fi / Bluetooth.
rg -IN → | ch 5:
Latency tuner: increasing target latency 8820(200.000ms) → 13535(306.910ms)
Latency tuner: increasing target latency 13535(306.916ms) → 27415(621.666ms)
Latency monitor: decreasing target latency 27415(621.655ms) → 21771(493.667ms)
Latency monitor: decreasing target latency 21771(493.673ms) → 17289(392.035ms)
Latency monitor: decreasing target latency 17289(392.041ms) → 13730(311.327ms)
So we started from 200ms, jumped to 306ms, then after a big jitter spike to 621ms, then gradually rolled back to 311ms (during 12 minutes).
Drops happened only once during that jitter spike. FEC partially recovered those drops.
rg -IN 'watchdog|repaired|→' | perl -ne 'print if /18:57:00/../19:03:43/'
18:57:00.531 [1980925] [err] roc_audio: [latency_tuner.cpp:608] Latency tuner: increasing target latency 8820(200.000ms) → 13535(306.910ms)
18:57:00.532 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ..............ibb...
18:57:44.659 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: ...ibbbbbbbbbbbbbbbb
18:57:44.659 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.681 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.702 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.723 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.745 [1980925] [err] roc_audio: [latency_tuner.cpp:608] Latency tuner: increasing target latency 13535(306.916ms) → 27415(621.666ms)
18:57:44.745 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbD..
18:57:44.862 [1980925] [dbg] roc_fec: [openfec_decoder.cpp:370] openfec decoder: repaired 0/14/255 ....XXXXXXXXXXXXXX .........
18:57:44.862 [1980925] [dbg] roc_fec: [openfec_decoder.cpp:370] openfec decoder: repaired 0/18/255 XXXXXXXXXXXXXXXXXX .........
18:57:44.862 [1980925] [dbg] roc_fec: [openfec_decoder.cpp:370] openfec decoder: repaired 2/2/255 rr................ .........
18:57:44.862 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: .................ibb
18:57:44.883 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.905 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.926 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.947 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.968 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.980 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:44.990 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:45.011 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbbbbbbb
18:57:45.032 [1980925] [dbg] roc_audio: [watchdog.cpp:295] watchdog: status: bbbbbbbbbbbbbbi.....
19:00:14.856 [1980925] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 27415(621.655ms) → 21771(493.667ms)
19:01:09.693 [1980925] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 21771(493.673ms) → 17289(392.035ms)
19:03:43.729 [1980925] [err] roc_audio: [latency_tuner.cpp:564] Latency monitor: decreasing target latency 17289(392.041ms) → 13730(311.327ms)
Wi-Fi, unstable channel
Streaming from laptop to orange pi connected to 2 different 5ghz access points, in turn connected together via cable.
I'm seeing regular repairs and drops (and crackling), but don't see any reaction from latency tuner.
Sample 1
Sample 2
Sample 3
After a closer look, it seems the last one is really caused by packet loss, as discussed in chat.
So it's not a problem of latency adjustment algorithm, it behaves correctly here.
From wireshark:
5.61% of 314sec = 17.6 secs of lost audio
From our log:
rg -NI 'watchdog: status' | ch -1 | tr -d '\n' | grep -o 'b' | wc -l
67308
67308 blank frames * 8 samples per frame / 44100Hz = 12.21 secs of lost audio
(8 samples used by speex resampler)
Also from our log:
rg -NI cum_loss | ch 13 | tail -1
cum_loss=2301
Wireshark report and our report are close in number and duration of lost packets.
BTW reported max jitter is quite different: 100ms vs 600ms.
The problem can be solved by using larger FEC blocks on sender, e.g. --nbsrc=30 --nbrpr=20
.
If I use --packet-len 7ms --nbsrc 40 --nbrpr 20
on sender, which gives FEC blocks of 280ms, receiver still uses target latency of 200ms for some reason.
(Jitter is below 200ms in that case).
Wi-Fi, unstable channel
I did some more experiments in this setup (streaming from laptop to orange pi connected to 2 different 5ghz access points, in turn connected together via cable).
Some observations:
-
All problems are caused by burst losses.
-
Here is the best configuration that I've found for this particular setup:
- on sender: increase packet_len to 7ms
- on sender: increase FEC block size to 60/20 (420ms)
- on sender: switch from reed-solomon to LDPC-staircase
- on receiver: increase min_latency to 500ms
-
Why:
- when packets are smaller, there are more losses
- when FEC block is smaller, burst losses cause loss of entire block(s)
- when FEC block is larger, and there are losses, reed-solomon is too slow and causes glitches on receiver (device spec)
- FEC works fine only when target latency is larger than FEC block, in this case higher by ~100ms works well
All this is expected, just want to document an actual experiment.
BTW even after doing all this I'm still not satisfied with that setup - sometimes burst losses are too long to be covered by a sane FEC block size. I wonder if RTX would help here (I don't know if in this periods ALL packets are dropped or just MANY of them. Also I don't know yet exact reason of losses - physical link or router queues).
Also I think compression would help here.
During some short period, I was constantly getting this panic:
link meter: seqnum was not processed in writer part
I never restarted sender. I restarted receiver multiple times and got panic each time.
After a short time, panic disappears. And it starts reproducing again for a short time after a minute or two. And so on.
I suppose it's related to seqnum overflows.
First two files in zip are just logs. Second to files also has RTP headers, I added packet->print(0)
to LinkMeter::read().
Here is an example when tuner was too optimistic, reduced latency, and it caused drops of late packets and crackling.
It happened after a long measurement (about 30 minutes). Attached logs have only the final part. See "dropping late packet" in the end of pulse.log.
@gavv I'm about to enable roc-recv
to determine if latency is fixed or it is dynamic.
We've got --target-latency
command line parameter.
- As you proposed, one possible option is to allow it to comprehend
auto
variant, but it wouldn't be possible to specify a starting latency, which would be default in this case. I'd like to keep it though - Another option is to add a bool parameter
--dynamic-latency
which should be set to true by default
If we want to support specifying starting latency in CLI, then we also need it in C API?
There are two approaches for C API:
- target_latency = 0 means adaptive latency, and we add a new field start_latency
- we add a new flag / mode that enables adaptive latency; in this mode target_latency is used as starting latency
I think the second approach is a bit confusing, because target_latency suddenly transforms to starting latency in some cases.
If you agree, then in C API we can have target_latency and start_latency. In this case, I think in CLI we should have the same: --target-latency and --start-latency (because CLI mirrors C API).
If in future version we'll deduce starting latency automatically, this layout will work fine too. In this case start_latency = 0 would mean "deduce starting latency", and start_latency != 0 would mean "use fixed starting latency".
tests: cover new metrics (jitter, total_packets, lost_packets - smoke test that all 3 metrics are available on both sender and receiver)
We need #674 to see these metrics on sender
Pipeline tests test_loopback_sink_2_source
fail with FlagCTS
: timestamp_mapping and timestamp_mapping_remixing
https://github.com/roc-streaming/roc-toolkit/blob/c327926dc0af72c78e15135ae6ac53af75f71fa6/src/tests/roc_pipeline/test_loopback_sink_2_source.cpp#L399-L403
Pipeline tests test_loopback_sink_2_source fail with FlagCTS: timestamp_mapping and timestamp_mapping_remixing
It's actually not related to CTS.
Currently you have:
if (flags & FlagRTCP) {
...
if ((flags & FlagCTS) == 0) {
<check total_packets>
If you look at the tests which set FlagCTS, you'll see:
TEST(loopback_sink_2_source, timestamp_mapping) {
enum { Chans = Chans_Stereo, NumSess = 1 };
send_receive(FlagRTCP | FlagCTS, NumSess, Chans, Chans);
}
TEST(loopback_sink_2_source, timestamp_mapping_remixing) {
enum { FrameChans = Chans_Mono, PacketChans = Chans_Stereo, NumSess = 1 };
send_receive(FlagRTCP | FlagCTS, NumSess, FrameChans, PacketChans);
}
FlagRTCP and FlagCTS are always set together, so your if
effectively disables total_packets check in all tests, i.e. there are no tests where this check actually works.
The difference of 1 packet is actually expected. SenderParticipantMetrics.link.total_packets
is not number of packets generated by sender. It is number of packets expected by receiver, as reported via RTCP feedback from receiver to sender.
(The whole SenderParticipantMetrics is based on RTCP feedback. It's more obvious for other fields, but is surprising for total_packets
field. However, if we would set total_packet differently, it would cause inconsistency with other metrics).
When we call check_metrics(), the latest RTCP report from receiver to sender in not processed by sender yet, hence the difference.
I think in this test, since it is very high-level, we can consider it's an implementation detail (when exactly RTCP is processed) and allow 1 packet delta for the counters.
I suggest to replace this check:
UNSIGNED_LONGS_EQUAL(recv_party_metrics.link.lost_packets,
send_party_metrics.link.lost_packets);
with:
CHECK(send_party_metrics.link.total_packets) >= proxy.n_source() - 1 &&
send_party_metrics.link.total_packets) <= proxy.n_source());
UNSIGNED_LONGS_EQUAL(proxy.n_source(),
recv_party_metrics.link.total_packets);
(It's better to check against PacketProxy::n_source())
We need https://github.com/roc-streaming/roc-toolkit/issues/674 to see these metrics on sender
As I said above, total_packets and other link metrics that you see on sender are from RTCP feedback.
If you run test with -v -t
flags:
roc-test-pipeline -v -t -sn timestamp_mapping
you'll see actual RTCP packets being delivered:
@ packet [0x5631436235f8] [udp,rtcp,control,prepared,composed]
udp: src=<none> dst=127.0.0.1:44 rts=0
rtcp: size=176
@ rtcp packet (176 bytes)
+ rr:
|- header:
|-- version: 2
|-- padding: 0
|-- counter: 1
|-- type: 201
|-- length: 32 bytes (7 words)
|- body:
|-- ssrc: 3996983558
|- block:
|-- ssrc: 3010344341
|-- fract_loss: 0.000000
|-- cum_loss: 0
|-- last_seqnum: 30389
|-- jitter: 36
|-- lsr: 0000c0c02e350000 (unix 2086027840180496215)
|-- dlsr: 00000000000f0000 (unix 228881)
+ xr:
|- header:
|-- version: 2
|-- padding: 0
|-- counter: 0
|-- type: 207
|-- length: 96 bytes (23 words)
|- body:
|-- ssrc: 3996983558
|- rrtr:
|-- block header:
|--- type: 4
|--- type_specific: 0x0
|--- length: 12 bytes (2 words)
|-- block body:
|--- ntp_timestamp: 83b9c0c029905011 (unix 1000000162358287)
|- measurement:
|-- block header:
|--- type: 14
|--- type_specific: 0x0
|--- length: 32 bytes (7 words)
|-- block body:
|--- ssrc: 3010344341
|--- first_sn: 30191
|--- interval_first_sn: 30389
|--- interval_last_sn: 30389
|--- interval_duration: 0000000000000000 (unix 0)
|--- cum_duration: 0000000000000000 (unix 0)
|- delay:
|-- block header:
|--- type: 16
|--- type_specific: 0x40
|--- length: 28 bytes (6 words)
|-- block body:
|--- flag: sample (1)
|--- ssrc: 3010344341
|--- rtt_mean: 00000000000f0000 (unix 228881)
|--- rtt_min: 00000000000f0000 (unix 228881)
|--- rtt_max: 00000000000f0000 (unix 228881)
|--- e2e_latency: 0000000019999999 (unix 99999999)
|- queue:
|-- block header:
|--- type: 220
|--- type_specific: 0x40
|--- length: 16 bytes (3 words)
|-- block body:
|--- flag: sample (1)
|--- ssrc: 3010344341
|--- niq_latency: 0000000004780000 (unix 17456054)
|--- niq_stalling: 0000000000000000 (unix 0)
+ sdes:
|- chunk:
|-- ssrc: 3996983558
|-- item:
|--- type: cname (1)
|--- text: 0425e8c6-21a6-47d3-807f-bba0a9b60742
Here you can see that for example jitter
is non-zero.
I think in test_loopback_sink_2_source it's enough to check that:
- jitter is non-zero
- if FlagLosses is enabled, lost_packets becomes non-zero after a while
- jitter, lost_packets, total_packets have close values on sender and receiver (but not strictly equal)
Since we plan to have good coverage for these metrics in test_receiver_source and test_sender_sink, in test_loopback_sink_2_source it's enough to do only a very basic check.
/** Total amount of packets sent or expected to be received.
*
* On sender, this counter is just incremented every packet.
* On receiver, it is derived from seqnums.
*/
So this comment is wrong.
Maybe expected_packets
would be a better name for this metrics?
With a comment like:
Total amount of packets that receiver expects to be delivered.
QA: https://github.com/roc-streaming/qa/tree/main/manual/20241011_gh688_adaptive_latency