ustreamer icon indicating copy to clipboard operation
ustreamer copied to clipboard

Separate thread for rtp_wrap_h264/relay, detecting missed writes

Open aggieNick02 opened this issue 3 years ago • 15 comments

I finally got around to this. What do you think? The separate thread is off by default. Turning it on definitely makes missed written frames much less likely, which is helpful when frames are bigger or at higher frequencies.

I also added two fields to the shared memory structure that make it easy to detect when a write is missed even though there are clients. It then prints a line to the log when it happens, which is really helpful for detecting the issue, versus watching encoded video and trying to spot it visually.

aggieNick02 avatar Sep 14 '21 16:09 aggieNick02

Sup. I'm very busy right now, so I'll look at it later. Sorry.

mdevaev avatar Sep 20 '21 07:09 mdevaev

Of course, I can imagine things are quite busy especially with the Kickstarter launch (I'm eagerly in that queue. :-) ) Thanks for the update.

aggieNick02 avatar Sep 20 '21 15:09 aggieNick02

JFYI I don't forget about it.

mdevaev avatar Oct 20 '21 02:10 mdevaev

I was completely exhausted, but I didn't forget about it. If you are not disappointed in the speed of my work yet, could you remind me a little bit about why we can't just increase the polling rate?

mdevaev avatar Jul 06 '22 23:07 mdevaev

Of course. I had to go back and remember why too. The details were in https://github.com/pikvm/ustreamer/issues/115 but were mixed in with a lot of other details on other stuff too. ;-)

The core of the issue is that even if we increase the polling rate, the time taken by invoking the Janus callback g_gw->relay_rtp can spike the time taken by the code to 20 ms or even higher. So even if we drop the polling rate to something really low like 5ms, the spikes from the Janus callback can make us miss frames. Having several (15) browser windows all pointing to the same video feed is a good way to force the issue and see the effects even with the standard ustreamer setup/encoding.

Pushing the rtp_wrap_h264 calls to another thread and using a ringbuffer to communicate with it means that _clients_thread is now only responsible for reading the memory sink, and thus missing a write to the sink becomes significantly less likely. As long as the average time for rtp_wrap_h264 in the other thread is less than the video framerate, and the ringbuffer has sufficient capacity, we won't drop any frames, even if there are occasionally large spikes in rtp_wrap_h264 time.

aggieNick02 avatar Jul 07 '22 15:07 aggieNick02

We can make it even simpler and potentially more scalable. What do you think if relay_rtp() for each client will work in its own separate thread?

mdevaev avatar Jul 07 '22 16:07 mdevaev

I think that would be fine, and it does seem like there would be some tradeoffs. Would each thread have its own ringbuffer? Memory usage would be higher and the time in _clients_thread would scale linearly with the number of clients as memory was copied to each ringbuffer. But a big positive is that one poorly behaving client would not be able to ruin the others.

I would expect the time for the memory copies to each ring buffer would be small enough that even scaling linearly with # of clients would be fine.

aggieNick02 avatar Jul 07 '22 16:07 aggieNick02

I plan to use a regular queue, as it is now done in audio.c. But yes, every client will have their own queue.

mdevaev avatar Jul 07 '22 18:07 mdevaev

I plan to use a regular queue, as it is now done in audio.c. But yes, every client will have their own queue.

Sounds great to me. :-)

aggieNick02 avatar Jul 08 '22 17:07 aggieNick02

Okay, I'll implement it and you can test it.

mdevaev avatar Jul 08 '22 17:07 mdevaev

Okay, I made it. Could you test? See the master commits.

mdevaev avatar Jul 10 '22 03:07 mdevaev

Okay, I made it. Could you test? See the master commits.

Of course. I just got back from being out and it's been a while since I've played with my external encoding setup, so it may take a few days until I get to it, but it's on my list to take care of soon.

aggieNick02 avatar Jul 12 '22 19:07 aggieNick02

Cool. Today I commited the last thing so I believe everything should work now.

mdevaev avatar Jul 13 '22 03:07 mdevaev

Hey just FYI this is still on my todo but I haven't gotten to it yet. It may be a few weeks if that's not too much trouble. I don't currently have my pikvms set up for active dev, so it'll take me a few hours to get everything up and test, and finding those hours is tight at the moment.

If validating this becomes urgent on your end just let me know and I'll try to get it done sooner. Thanks again for keeping it in mind and extending the audio solution to solve the h264 issues I had as well.

aggieNick02 avatar Jul 22 '22 00:07 aggieNick02

There are no problems and I'm not in a hurry with this. My recent fixes with queues have made it possible to stream to 40 clients. Before the queues were implemented, there were a lot of frame skips. I guess I solved your problem too.

mdevaev avatar Jul 22 '22 00:07 mdevaev

Boop :)

mdevaev avatar Aug 23 '22 21:08 mdevaev

Sorry, thanks for the reminder. I'll try to get this done in the next couple days.

aggieNick02 avatar Aug 24 '22 19:08 aggieNick02

Hoped to get to it today, but ran out of time. By Tuesday at the latest; thanks for the patience.

aggieNick02 avatar Aug 26 '22 19:08 aggieNick02

Finally got a chance to play with this and validate today. The code looks great, and the performance is good.

There's definitely an increase in lag as the number of clients gets big, but that's not really avoidable, there's only so much a pi can do. :-) I had 20 clients streaming a 1080p animated video without any corruption and it was still totally usable, just with a bit more lag. That's quite impressive.

Thanks for fixing this! :-)

aggieNick02 avatar Aug 30 '22 20:08 aggieNick02

Great! Thank you for your work here, you have helped me a lot.

mdevaev avatar Aug 30 '22 20:08 mdevaev