srt icon indicating copy to clipboard operation
srt copied to clipboard

Connecting SRT caller through nginx-ingress

Open tab1293 opened this issue 10 months ago • 5 comments

Hello all,

I'm attempting to connect to a SRT listener that is sitting behind nginx-ingress. I have configured nginx-ingress to forward UDP traffic on port 4000 to this SRT listener workload. While I am able to connect over UDP port 4000 using a simple netcat test, I am unable connect via srt_connect.

Here is what my netcat test looks like:

On listener node:

# nc -u -l 0.0.0.0 4000
test
test
test

On caller node:

# while true; do echo "test"; sleep 2; done | nc -u 35.245.186.147 4000

But when I try to connect to the SRT listener on the same port, I get a connection timeout. Here is what the SRT caller test looks like (I am using tsduck for this test):

On listener node:

# tsp -v -I srt --caller 35.194.65.144:7000 --rcvbuf 187500000000 -P regulate -O srt 0.0.0.0:4000

On caller node:

# tsp -v  --debug=4  -I srt --caller 35.245.186.147:4000 --transtype live -O play
tsp: Debug[4]: debug level set to 4
tsp: Debug: ====> tsp -v --debug=4
* Debug[4]: debug level set to 4
* Debug[4]: srt: debug level set to 4
* Debug: srt: ====> tsp -I srt --caller 35.245.186.147:4000 --transtype live
* Debug: trying to load "/opt/homebrew/Cellar/tsduck/3.35-3258/lib/tsduck/tsplugin_play.dylib"
* Debug: registering output plugin "play", status: ok
* Debug[4]: play: debug level set to 4
* Debug: play: ====> tsp -O play
* Debug: tsp: buffer size: 89,240 TS packets, 16,777,120 bytes
* Debug: srt: calling srt_create_socket()
* Debug: srt: calling srt_setsockflag(SRTO_TRANSTYPE, 00 00 00 00, 4)
* Debug: srt: calling srt_setsockflag(SRTO_MESSAGEAPI, 01, 1)
* Debug: srt: calling srt_connect(35.245.186.147:4000)
* Error: srt: error during srt_connect: Connection setup failure: connection timed out
* Debug: srt: SRTSocket::open, sock = 0x0C5BCA54, listener = 0xFFFFFFFF
* Debug: srt: SRTSocket::close, sock = 0x0C5BCA54, listener = 0xFFFFFFFF, final stats: false
* Debug: srt: calling srt_close()
* Debug: start() error in plugin srt
* Debug: srt: SRTSocket::close, sock = 0xFFFFFFFF, listener = 0xFFFFFFFF, final stats: false
* Debug: Report logging thread terminated

Here is what srt-live-transmit caller looks like:

# srt-live-transmit -v -t 5 'srt://35.245.186.147:4000?mode=caller' udp://127.0.0.1:9000
TIMEOUT: will interrupt after 5s
Media path: 'srt://35.245.186.147:4000?mode=caller' --> 'udp://127.0.0.1:9000'
SRT parameters specified:

	mode = 'caller'
Opening SRT source caller on 35.245.186.147:4000
Connecting to 35.245.186.147:4000
SrtCommon: DESTROYING CONNECTION, closing sockets (rt%753684003 ls%-1)...
SrtCommon: ... done.
SRT parameters specified:

	mode = 'caller'
Opening SRT source caller on 35.245.186.147:4000
Connecting to 35.245.186.147:4000

---------- INTERRUPT ON TIMEOUT!

SrtCommon: DESTROYING CONNECTION, closing sockets (rt%753684002 ls%-1)...
SrtCommon: ... done.

Can anyone help me understand why a SRT caller connection specifically cannot be established through nginx-ingress, while a basic netcat UDP connection test works?

tab1293 avatar Feb 28 '25 20:02 tab1293

It seems like maybe because the Peer IP Address in the handshake packet does not match the source IP address of the nginx ingress controller pod (10.64.2.35), the connection may be dropped?

Image

tab1293 avatar Mar 01 '25 19:03 tab1293

Here's a bit more debugging info...

For the listener I'm running:

srt-live-transmit -loglevel:debug udp://:1234 srt://:4000

with the source content being generated via ffmpeg and sent locally to udp://:1243

Here are the corresponding listener logs:

01:51:16.800701/SRT:RcvQ:w1 D:SRT.qr: INCOMING PACKET: FROM=10.64.2.44:57811 BOUND=0.0.0.0:4000 TARGET=@0 CONTROL: size=48 type=handshake HS: version=4 type=0x2 ISN=500412327 MSS=1500 FLW=8192 reqtype=induction srcID=805305282 cookie=0 srcIP=35.245.186.147.0.0.0.0.0.0.0.0.0.0.0.0.
01:51:16.800808/SRT:RcvQ:w1 D:SRT.cn: Got sockID=0 from 10.64.2.44:57811 - trying to resolve it as a connection request...
01:51:16.800817/SRT:RcvQ:w1 D:SRT.cn: PASSING request from: 10.64.2.44:57811 to listener:414958486
01:51:16.800823/SRT:RcvQ:w1 D:SRT.cn: @414958486: processConnectRequest: received a connection request
01:51:16.800837/SRT:RcvQ:w1 D:SRT.cn: @414958486: processConnectRequest: new cookie: 7aa81c5a
01:51:16.800844/SRT:RcvQ:w1 D:SRT.cn: @414958486: processConnectRequest: received type=induction, sending back with cookie+socket
01:51:16.800850/SRT:RcvQ:w1 D:SRT.cn: @414958486: processConnectRequest: NOT  Advertising PBKEYLEN - value = 0
01:51:16.800856/SRT:RcvQ:w1 D:SRT.cn: @414958486: processConnectRequest: SENDING HS (i): version=5 type=0x4a17 ISN=500412327 MSS=1500 FLW=8192 reqtype=induction srcID=805305282 cookie=7aa81c5a srcIP=35.245.186.147.0.0.0.0.0.0.0.0.0.0.0.0.FLAGS: MAGIC
01:51:16.800865/SRT:RcvQ:w1 D:SRT.ks: CChannel::sendto: SENDING NOW DST=10.64.2.44:57811 target=@805305282 size=48 pkt.ts=12321562 TARGET=@805305282 CONTROL: size=48 type=handshake HS: version=5 type=0x4a17 ISN=500412327 MSS=1500 FLW=8192 reqtype=induction srcID=805305282 cookie=7aa81c5a srcIP=35.245.186.147.0.0.0.0.0.0.0.0.0.0.0.0.FLAGS: MAGIC
01:51:16.800899/SRT:RcvQ:w1 D:SRT.cn: Listener got the connection request from: 10.64.2.44:57811 result:waveahand
01:51:16.800906/SRT:RcvQ:w1 D:SRT.qr: worker: result for the unit: INDUCED/CONCLUDING
01:51:16.800912/SRT:RcvQ:w1 D:SRT.qr: worker: RECEIVED PACKET --> updateConnStatus. cst=INDUCED/CONCLUDING id=0 pkt-payload-size=48
01:51:16.810499/srt-live-transm D:SRT.ea: CEPoll::wait: EVENT WAITING: CHECKPOINT
01:51:16.810540/srt-live-transm D:SRT.ea: CEPoll::wait: REPORTED 0/0
01:51:16.810549/srt-live-transm D:SRT.ea: CEPoll::wait: LINUX: picking up 0 ready fds.
01:51:16.810554/srt-live-transm D:SRT.ea: CEPoll::wait: Total of 0 READY SOCKETS
01:51:16.816398/SRT:RcvQ:w1 D:SRT.qr: INCOMING PACKET: FROM=10.64.2.44:53949 BOUND=0.0.0.0:4000 TARGET=@0 CONTROL: size=64 type=handshake HS: version=5 type=0x1 ISN=500412327 MSS=1500 FLW=8192 reqtype=conclusion srcID=805305282 cookie=7aa81c5a srcIP=35.245.186.147.0.0.0.0.0.0.0.0.0.0.0.0.FLAGS:  hsx no-pbklen
01:51:16.816452/SRT:RcvQ:w1 D:SRT.cn: Got sockID=0 from 10.64.2.44:53949 - trying to resolve it as a connection request...
01:51:16.816459/SRT:RcvQ:w1 D:SRT.cn: PASSING request from: 10.64.2.44:53949 to listener:414958486
01:51:16.816466/SRT:RcvQ:w1 D:SRT.cn: @414958486: processConnectRequest: received a connection request
01:51:16.816475/SRT:RcvQ:w1 D:SRT.cn: @414958486: processConnectRequest: new cookie: d067a390
01:51:16.816482/SRT:RcvQ:w1 D:SRT.cn: @414958486: processConnectRequest: received type=conclusion - checking cookie...
01:51:16.816489/SRT:RcvQ:w1 D:SRT.cn: @414958486: processConnectRequest: ...wrong cookie 82850470. Ignoring.
01:51:16.816511/SRT:RcvQ:w1 D:SRT.cn: Listener got the connection request from: 10.64.2.44:53949 result:INVALID
01:51:16.816518/SRT:RcvQ:w1 D:SRT.qr: worker: result for the unit: REJECTED
01:51:16.816523/SRT:RcvQ:w1 D:SRT.qr: worker: RECEIVED PACKET --> updateConnStatus. cst=REJECTED id=0 pkt-payload-size=64

On the caller I'm running:

srt-live-transmit -loglevel:debug -t 5 'srt://35.194.65.144:4000?mode=caller' udp://127.0.0.1:9000

Here are the corresponding caller logs:

20:51:16.777366/ D:SRT.sm: generateSocketID: : @805305282
20:51:16.777591/ D:SRT.sm: @805305282:newSocket: mapping socket 805305282
20:51:16.777129/ D:SRT.sm: checkBrokenSockets: after removal: m_ClosedSockets.size()=0
20:51:16.777603/ D:SRT.in: GC: sleep 1 s
20:51:16.777618/ D:SRT.ac: @805305282: OPTION: #2 value:00
20:51:16.777650/ D:SRT.ac: @805305282: OPTION: #49 value:B0500
20:51:16.777668/ D:SRT.cn: @805305282: clearData: PAYLOAD SIZE: 1488
20:51:16.777861/ D:SRT.km: CHANNEL: Bound to local address: 0.0.0.0:0
20:51:16.777922/ D:SRT.sm: bind: creating new multiplexer for port 51279
20:51:16.778127/ D:SRT.ac: @805305282: startConnect: -> 35.245.186.147:4000 (ASYNCHRONOUS)...
20:51:16.778146/ D:SRT.cn: registerConnector: adding @805305282 addr=35.245.186.147:4000 TTL=02:36:43.4221976 [STDY]
20:51:16.778166/ D:SRT.cn: RID: adding socket @805305282 for address: 35.245.186.147:4000 expires: 02:36:43.4221976 [STDY] (total connectors: 1)
20:51:16.778171/ D:SRT.ac: @805305282: startConnect: ISN generated = 500412327
20:51:16.778175/ D:SRT.cn: @805305282: CUDT::startConnect: REQ-TIME set HIGH (TimeStamp: 502). SENDING HS: version=4 type=0x2 ISN=500412327 MSS=1500 FLW=8192 reqtype=induction srcID=805305282 cookie=0 srcIP=35.245.186.147.0.0.0.0.0.0.0.0.0.0.0.0.
20:51:16.778181/ D:SRT.ks: CChannel::sendto: SENDING NOW DST=35.245.186.147:4000 target=@0 size=48 pkt.ts=502 TARGET=@0 CONTROL: size=48 type=handshake HS: version=4 type=0x2 ISN=500412327 MSS=1500 FLW=8192 reqtype=induction srcID=805305282 cookie=0 srcIP=35.245.186.147.0.0.0.0.0.0.0.0.0.0.0.0.
20:51:16.778284/ D:SRT.cn: @805305282: startConnect: ASYNC MODE DETECTED. Deferring the process to RcvQ:worker
20:51:16.778288/ D:SRT.ac: @805305282: OPTION: #2 value:00
20:51:16.778307/ D:SRT.ea: srt_epoll_update_usock: ADDED E1 for @805305282 [R][E]
20:51:16.778323/ D:SRT.ea: CEPoll::wait: REPORTED 0/0
20:51:16.778327/ D:SRT.ea: CEPoll::wait: Total of 0 READY SOCKETS
20:51:16.788958/ D:SRT.cn: updateConnStatus: updating after getting pkt with DST socket ID @0 status: AGAIN
20:51:16.789007/ D:SRT.cn: RID: socket @805305282 still active (remaining 2.989138s of TTL)...
20:51:16.789023/ D:SRT.cn: RID:@805305282 10.832000ms passed since last connection request.
20:51:16.790839/ D:SRT.ea: CEPoll::wait: EVENT WAITING: CHECKPOINT
20:51:16.790845/ D:SRT.ea: CEPoll::wait: REPORTED 0/0
20:51:16.790849/ D:SRT.ea: CEPoll::wait: Total of 0 READY SOCKETS
20:51:16.800063/ D:SRT.cn: updateConnStatus: updating after getting pkt with DST socket ID @0 status: AGAIN
20:51:16.800101/ D:SRT.cn: RID: socket @805305282 still active (remaining 2.978045s of TTL)...
20:51:16.800121/ D:SRT.cn: RID:@805305282 21.926000ms passed since last connection request.
20:51:16.801681/ D:SRT.qr: INCOMING PACKET: FROM=35.245.186.147:4000 BOUND=0.0.0.0:0 TARGET=@805305282 CONTROL: size=48 type=handshake HS: version=5 type=0x4a17 ISN=500412327 MSS=1500 FLW=8192 reqtype=induction srcID=805305282 cookie=7aa81c5a srcIP=35.245.186.147.0.0.0.0.0.0.0.0.0.0.0.0.FLAGS: MAGIC
20:51:16.801701/ D:SRT.cn: worker_ProcessAddressedPacket: resending to QUEUED socket @805305282
20:51:16.801705/ D:SRT.cn: RID: found id @805305282 while looking for THIS ID FROM 35.245.186.147:4000
20:51:16.801709/ D:SRT.cn: AsyncOrRND: packet RESOLVED TO @805305282 -- continuing as ASYNC CONNECT
20:51:16.801726/ D:SRT.cn: @805305282: processAsyncConnectResponse: got response for connect request, processing
20:51:16.801743/ D:SRT.cn: @805305282: processConnectResponse: TYPE:handshake
20:51:16.801746/ D:SRT.cn: @805305282: processConnectResponse: HS RECEIVED: version=5 type=0x4a17 ISN=500412327 MSS=1500 FLW=8192 reqtype=induction srcID=805305282 cookie=7aa81c5a srcIP=35.245.186.147.0.0.0.0.0.0.0.0.0.0.0.0.FLAGS: MAGIC
20:51:16.801755/ D:SRT.cn: @805305282: processConnectResponse: REQ-TIME LOW; got INDUCTION HS response (cookie:7aa81c5a version:5), sending CONCLUSION HS with this cookie
20:51:16.801760/ D:SRT.cn: @805305282: processConnectResponse: No encryption flags found in type field: 18967
20:51:16.801787/ D:SRT.rs: @805305282: createCrypter: setting RCV/SND KeyLen=0
20:51:16.801791/ D:SRT.cn: CCryptoControl::init: HS SIDE:INITIATOR DIRECTION:BOTH
20:51:16.801794/ D:SRT.cn: CCryptoControl::init: CAN'T CREATE crypto: key length for SND = 0
20:51:16.801797/ D:SRT.cn: @805305282: processAsyncConnectResponse: response processing result: INDUCED/CONCLUDING; REQ-TIME LOW to enforce immediate response
20:51:16.801802/ D:SRT.qr: worker: result for the unit: INDUCED/CONCLUDING
20:51:16.801806/ D:SRT.qr: worker: RECEIVED PACKET --> updateConnStatus. cst=INDUCED/CONCLUDING id=805305282 pkt-payload-size=48
20:51:16.801811/ D:SRT.cn: updateConnStatus: updating after getting pkt with DST socket ID @805305282 status: INDUCED/CONCLUDING
20:51:16.801814/ D:SRT.cn: RID: socket @805305282 still active (remaining 2.976331s of TTL)...
20:51:16.801818/ D:SRT.cn: RID:@805305282 cst=INDUCED/CONCLUDING -- repeating connection request.
20:51:16.801822/ D:SRT.cn: updateConnStatus: collected 1 for processing, 0 to close
20:51:16.801825/ D:SRT.cn: updateConnStatus: cst=INDUCED/CONCLUDING for @805305282
20:51:16.801828/ D:SRT.cn: updateConnStatus: processing async conn for @805305282 FROM 35.245.186.147:4000
20:51:16.801833/ D:SRT.cn: @805305282: processAsyncConnectRequest: REQ-TIME: HIGH. Should prevent too quick responses.
20:51:16.801837/ D:SRT.cn: @805305282: processAsyncConnectRequest: serializing HS: buffer size=1488
20:51:16.801853/ D:SRT.cn: @805305282: createSrtHandshake: buf size=1488 hsx=EXT:hsreq kmx=EXT:kmreq kmdata_wordsize=0 version=5
20:51:16.801857/ D:SRT.cn: @805305282: createSrtHandshake: NOT  Advertising PBKEYLEN - value = 0
20:51:16.801861/ D:SRT.cn: @805305282: createSrtHandshake: (ext: HSX) data: version=5 type=0x1 ISN=500412327 MSS=1500 FLW=8192 reqtype=conclusion srcID=805305282 cookie=7aa81c5a srcIP=35.245.186.147.0.0.0.0.0.0.0.0.0.0.0.0.FLAGS:  hsx no-pbklen
20:51:16.801870/ D:SRT.cn: @805305282: HSREQ/snd: LATENCY[SND:0 RCV:120] FLAGS[+TSBPD-snd +TSBPD-rcv +haicrypt +TLPktDrop +NAKReport +ReXmitFlag -StreamAPI +FilterCapable ]
20:51:16.801876/ D:SRT.cn: @805305282: createSrtHandshake: after HSREQ: offset=13 HSREQ size=3 space left: 359
20:51:16.801881/ D:SRT.cn: @805305282: createSrtHandshake: filled HSv5 handshake flags:  hsx no-pbklen length: 64 bytes
20:51:16.801885/ D:SRT.cn: @805305282: processAsyncConnectRequest: sending HS reqtype=conclusion to socket 0 size=64
20:51:16.801890/ D:SRT.cn: @805305282: processAsyncConnectRequest: setting REQ-TIME HIGH, SENDING HS:version=5 type=0x1 ISN=500412327 MSS=1500 FLW=8192 reqtype=conclusion srcID=805305282 cookie=7aa81c5a srcIP=35.245.186.147.0.0.0.0.0.0.0.0.0.0.0.0.FLAGS:  hsx no-pbklen
20:51:16.801898/ D:SRT.ks: CChannel::sendto: SENDING NOW DST=35.245.186.147:4000 target=@0 size=64 pkt.ts=24161 TARGET=@0 CONTROL: size=64 type=handshake HS: version=5 type=0x1 ISN=500412327 MSS=1500 FLW=8192 reqtype=conclusion srcID=805305282 cookie=7aa81c5a srcIP=35.245.186.147.0.0.0.0.0.0.0.0.0.0.0.0.FLAGS:  hsx no-pbklen
20:51:16.803361/ D:SRT.ea: CEPoll::wait: EVENT WAITING: CHECKPOINT
20:51:16.803368/ D:SRT.ea: CEPoll::wait: REPORTED 0/0
20:51:16.803371/ D:SRT.ea: CEPoll::wait: Total of 0 READY SOCKETS
20:51:16.812956/ D:SRT.cn: updateConnStatus: updating after getting pkt with DST socket ID @805305282 status: INDUCED/CONCLUDING
20:51:16.812990/ D:SRT.cn: RID: socket @805305282 still active (remaining 2.965156s of TTL)...
20:51:16.812997/ D:SRT.cn: RID:@805305282 11.091000ms passed since last connection request.
20:51:16.815454/ D:SRT.ea: CEPoll::wait: EVENT WAITING: CHECKPOINT
20:51:16.815461/ D:SRT.ea: CEPoll::wait: REPORTED 0/0
20:51:16.815465/ D:SRT.ea: CEPoll::wait: Total of 0 READY SOCKETS
20:51:16.824031/ D:SRT.cn: updateConnStatus: updating after getting pkt with DST socket ID @805305282 status: INDUCED/CONCLUDING
20:51:16.824071/ D:SRT.cn: RID: socket @805305282 still active (remaining 2.954074s of TTL)...

So it seems like something is going wrong during the handshake. Specifically something around the cookie being invalid and thus rejected:

@414958486: processConnectRequest: ...wrong cookie 82850470. Ignoring.
Listener got the connection request from: 10.64.2.44:53949 result:INVALID
worker: result for the unit: REJECTED
worker: RECEIVED PACKET --> updateConnStatus. cst=REJECTED id=0 pkt-payload-size=64

Can anyone help explain why the cookie is being rejected? It seems like something with nginx-ingress being the middleman here is throwing this handshake off? I'm trying to figure if there is some nginx-ingress configuration I need to tweak or maybe if there is a way to ignore this cookie check?

Edit: these are the relevant IPs from the logs

10.64.2.44      - nginx-ingress-controller internal pod IP
35.245.186.147  - nginx-ingress-controller public IP 

tab1293 avatar Mar 02 '25 02:03 tab1293

Did you set up the UDP traffic forwarding in both directions?

Have you noticed that in both handskahe requests (induction and conclusion) you have a different remote port number?

ethouris avatar Mar 03 '25 09:03 ethouris

@ethouris I've confirmed UDP traffic is working in both directions. I did this with a simple python script that's running on the same pod as my SRT listener:

#!/usr/bin/python3
import socket,struct
def loop_on_socket(s):
  while True:
    d, addr = s.recvfrom(1500)
    print(d, addr)
    s.sendto("ECHO: ".encode('utf8')+d, addr)

if __name__ == "__main__":
   HOST, PORT = "0.0.0.0", 4000
   sock = socket.socket(type=socket.SocketKind.SOCK_DGRAM)
   sock.bind((HOST, PORT))
   loop_on_socket(sock)

And then locally:

# nc -u 34.150.223.193 4000
hello
ECHO: hello

(Note: external ingress IP updated to 34.150.223.193 since last post)

But good catch with the different remote port number. Is this why the handshake is failing? I'm not sure if it's possible to have ingress-nginx keep a consistent sending port as it forwards traffic.

tab1293 avatar Mar 03 '25 20:03 tab1293

Not sure if it was clearly described in any documentation, but what you actually need for SRT to function is a "bidirectional UDP link".

So, for example, if you have a caller, and you are starting the handshake process, you need to get reserved IP:PORT pair on the source side (binding address - can be assigned automatically) and attempt to connect to another IP:PORT pair to the destination listener. Same on the listener side - if the listener receives a UDP packet from a certain IP:PORT pair, then this one defines the link (the source address on the caller side, in particular). The first packet with a special 0 value of the socket is a connection request, so there are no restrictions on it, but once it receives that kind of packet from a certain IP:PORT, it expects that the exactly same pair of IP:PORT is present in the source UDP address to be treated as a request sent by the same application. If it comes from a different IP:PORT, it's a different link and potentially a different application, so SRT expects, as the first and connection-unrelated packet, the packet with INDUCTION handshake with connection request, not a further CONCLUSION handshake - that one is rejected.

It is possible to "trick" SRT, but it must just "look like", by the IP:PORT pair for both source and destination in all UDP packets done in exchange in a single UDP bidirectional link. To make a clerer example:

Let's say you have a listener on 10.0.0.10:10000, and a caller on IP address 1.2.3.4.

The caller starts the connection, the system binds the socket to 0.0.0.0:54321 address.

The caller sends the UDP packet with handshake/induction, source: 1.2.3.4:54321, target: 10.0.0.10:10000.

The listener receives that packet and sends handshake/induction response with a cookie, source: 10.0.0.10:10000, target: 1.2.3.4:54321.

The caller then sends the UDP packet with handshake/conclusion, source 1.2.3.4:54321, target: 10.0.0.10:10000, with the previously received cookie.

The listener then must receive a UDP packet that has source: 1.2.3.4:54321, with handshake/conclusion, carrying the cookie. Otherwise the packet will be rejected.

You can then make whatever address translation you need, so for example, you will get 1.2.3.4:54321 translated into 10.0.0.4:12345, and the listener will get UDP packet with that already address. But the next UDP packet with source 1.2.3.4:54321 must be always translated into 10.0.0.4:12345 and to really nothing else, and also translated back to 1.2.3.4:54321 when the listener sends packets back to the caller. If that stable assignment doesn't hold, SRT will not work.

ethouris avatar Mar 04 '25 07:03 ethouris