Maximizing receive throughput for single listener

Open ccbrown opened this issue 1 year ago • 2 comments

I'm using SRT to transmit some very high-bitrate streams, and I'm finding it very challenging to get more than 1 Gbps of total throughput on a single server with one listener (tested on m5zn.xlarge AWS instances).

As I understand it, SRT uses a single receive thread for all connections, so even with lots of reasonably sized streams, we hit a wall at the same overall throughput when the RcvQ thread hits 100% CPU utilization (using SRT as a library here):

[Screenshot, 2022-08-20 19:17:15]
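For a rough sense of why one thread saturates, the per-packet time budget at a given aggregate bitrate can be estimated as below. This is a back-of-the-envelope sketch, not a measurement from this thread: it assumes 1316-byte payloads (7 MPEG-TS packets, a common SRT live-mode payload size) and ignores SRT, UDP, and IP header overhead as well as control packets. The function name `per_packet_budget_us` is made up for illustration.

```python
def per_packet_budget_us(bitrate_bps: float, payload_bytes: int = 1316) -> tuple[float, float]:
    """Return (packets per second, microseconds available per packet).

    Assumes every packet carries `payload_bytes` of payload and that one
    thread handles all of them; header overhead is ignored.
    """
    packets_per_second = bitrate_bps / (payload_bytes * 8)
    budget_us = 1e6 / packets_per_second
    return packets_per_second, budget_us


pps, budget = per_packet_budget_us(1e9)  # 1 Gbps aggregate
print(f"{pps:,.0f} packets/s, {budget:.2f} us per packet")
# prints "94,985 packets/s, 10.53 us per packet"
```

Under these assumptions, a single receive thread has roughly 10 microseconds per packet for the socket read, lookup of the owning connection, and buffer insertion, which is consistent with the thread becoming the limit near this bitrate.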

Of course I can load balance streams across multiple servers or listener processes (and I am), but still, I'd like to ask: Is there any way to scale SRT to higher bitrates with a single listener? E.g. is there any way to parallelize the work that RcvQ is doing to take advantage of more cores?

ccbrown, Aug 21 '22 04:08

Possibly this would be a use-case for my suggestion here https://github.com/Haivision/srt/issues/2399

I.e. you could use libevent/IOCP or something to do the UDP heavy-lifting, making use of threads in the most efficient way possible for the platform, and SRT would only be concerned with the protocol.

oviano, Sep 02 '22 11:09

A single receive thread for all SRT connections "bound" to the same UDP port is currently a known bottleneck of the existing SRT implementation. The same thread also keeps track of some timers and events. See also SRT threading. This approach works quite well with a small number of connections but hits limits when the number of co-existing SRT connections bound to the same UDP port grows.
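Since the receive thread is tied to the UDP port, the workaround already mentioned in this thread (several listeners instead of one) amounts to spreading streams over several ports. A minimal sketch of a stable stream-to-port assignment follows. The port numbers, stream names, and the `pick_listener_port` helper are hypothetical, and the sketch only decides which port a sender should connect to; it does not call into SRT itself.

```python
import zlib


def pick_listener_port(stream_id: str, ports: list[int]) -> int:
    """Map a stream ID to one of the listener ports.

    Uses CRC32 rather than Python's built-in hash() so the mapping is the
    same across processes and restarts.
    """
    digest = zlib.crc32(stream_id.encode("utf-8"))
    return ports[digest % len(ports)]


# Hypothetical setup: one SRT listener per port, each bound to its own UDP port.
LISTENER_PORTS = [9000, 9001, 9002, 9003]

for sid in ("camera-1", "camera-2", "camera-3", "camera-4"):
    print(sid, "->", pick_listener_port(sid, LISTENER_PORTS))
```

A hash spreads streams evenly only on average; with a few very large streams, an explicit assignment by bitrate may balance the load better.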

For the time being, I can suggest considering alternative implementations:

  • Rust-based implementation srt-rs.
  • Go-based implementation gosrt.

maxsharabayko, Sep 22 '22 14:09