srt [FR] AccessControl Decisions Outside of accept callback

The AccessControl concept has a major limitation: the access decision has to be made quickly, inside the accept callback.

It would be more useful if we could accept the socket but allow data to flow at a potentially much later time. For instance, the application may need a significant amount of time to decide whether to actually accept the connection and allow data to flow. It might want to create a pop-up for the user to decide, query credentials from a database, or perform some other decision-making operation that could take an indeterminate amount of time.

In that case, it would be useful for the accept callback to call srt_accept() to create the socket for the connection, but not actually use it for a VERY long time (e.g. while the user is deciding), or to decide after a VERY long time not to accept it because the user decided not to. It seems like it would be useful to be able to create the local socket and accept the connection, but not allow data to flow until the application's business logic decides, after some unspecified period of time, to allow or reject it.

Apr 07 '22 17:04 jlsantiago0

Interesting idea. Currently, the listener callback blocks the conclusion response handshake, because the decision has to be sent back to the caller: accept or reject. To do this type of feature, probably, a new state has to be created and signaled from the callback. Something like "connection pending", where a caller is notified that the listener has received the conclusion request, but needs some time to process it. Up until then, the connection is still pending.

May 03 '22 13:05 maxsharabayko

Yes, that problem came up when this mechanism was created. Despite its unfortunate influence on the receiver worker queue of the listener holder, it is considered sufficient for all the simple cases, and it doesn't break the general handshake rule or the overall rules of the API. Any such change would require reinventing the API top to bottom. Note that this API is modelled after TCP, and even TCP didn't anticipate such a thing as rejecting a connection; it was only possible to implement because it could be modelled after a general "no response/connection timeout" error. The condition is that any extra work performed in the meantime must be extremely quick, otherwise some of the API rules are broken.

But this problem can easily be overcome by the application by using the notion of a "session" and multiple connections.

The idea is that you first make a connection to the destination server and specify t=auth, with r specifying what kind of sensitive data exchange you are going to make. The sensitive data is then exchanged through this connection, and as a result the caller side should receive the session ID string and the passphrase. The client then makes a new connection to the same endpoint, sets the passphrase as received, and places s=session-ID in the streamid string.

May 03 '22 17:05 ethouris

OK, so I am trying to understand your proposal. First the client connects as caller to the server and sends t=auth + r as part of the StreamID, then the server sends over the regular data channel some OOB data that the client can interpret however it wishes? Then the client drops the connection and reconnects as caller with a different StreamID, using any passphrase info the server may have given it over the previous connection's data channel, and the regular stream of data is then sent over the new connection?

If this is your suggestion, we can implement anything we like at the application level using this two-stage process?

May 03 '22 21:05 jlsantiago0

This can actually be simple. The session ID doesn't have to be encrypted if the application makes sure that it is used on a connection made by exactly the same application (one that uses the same source IP:port) without having closed the auth connection, and rejects any connection using that session ID otherwise. The passphrase could be the one assigned to the username that was used in the auth connection. In other words:

Auth connection: streamid contains t=auth,u=<your-username>. Server responds with the session-id. The client keeps the connection open and can wait indefinitely for the answer.
Stream transmission connection: streamid contains s=<session-id> and the passphrase option is set to the passphrase for that user. Once the connection is established, the auth connection should be closed.
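
The two stream IDs described above can be sketched as follows. This is only an illustration: the "#!::" prefix follows the stream ID convention from the SRT access control guidelines, while the helper names, the "userpass" method name, and the sample username and session ID are hypothetical.

```python
# Sketch only: build and parse stream IDs for the two-stage flow.
# PREFIX follows the SRT access control convention; build_streamid,
# parse_streamid, "userpass", "alice" and "4f2a9c" are hypothetical.

PREFIX = "#!::"


def build_streamid(**fields):
    """Serialize key=value pairs into a '#!::k=v,k=v' string."""
    return PREFIX + ",".join(f"{key}={value}" for key, value in fields.items())


def parse_streamid(streamid):
    """Return the keys as a dict; raise ValueError on malformed input."""
    if not streamid.startswith(PREFIX):
        raise ValueError("not a structured stream ID")
    fields = {}
    for item in streamid[len(PREFIX):].split(","):
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed item: {item!r}")
        fields[key] = value
    return fields


# Stage 1: the auth connection names the user and the auth method.
auth_id = build_streamid(t="auth", u="alice", r="userpass")

# Stage 2: the transmission connection carries only the session ID.
stream_id = build_streamid(s="4f2a9c")
```

On the server side, parse_streamid(auth_id) would yield the t, u and r keys that the listener callback inspects before deciding.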

That method should have some sensible name so that the server knows which method of authentication to use, and that name should then be used in the r key for the auth connection.

You could also use a method in which the client and the server exchange public keys first. The client then sends the server the user:password phrase encrypted with the server's public key, and the server sends back a session-id:passphrase response (both randomly generated) encrypted with the client's public key. The client then uses this session-id in the s key of the streamid and sets the passphrase to the one received. This can be offered as an alternative method under a different name, or the server could support only one of them.

Note also that in the listener handler you can explicitly set a rejection code, and there should be something appropriate for the case of "unsupported authentication method".

May 04 '22 05:05 ethouris

Guess I am a bit confused. From what I understand:

The client is the caller making the Auth connection with StreamID t=auth,u=<user-name>. The server allows this connection in the srt_listen_callback(), I assume.
Then how does the server send the streamID? Does it srt_accept() this socket after the srt_listen_callback() has returned, and then do an srt_send() to the client on that socket? That must then be read with srt_recv() on the client side, and the client must interpret that data not as stream data but as a streamID to use for the stream transmission connection?
Then the client makes a new caller connection that is the Stream transmission connection with streamID as s=<session-id>? Also this connection is the one that would use encryption assigned to the username from the Auth connection?

Is this correct?

May 04 '22 14:05 jlsantiago0

Exactly.

If anything lengthy needs to be done to get the user data and the required passphrase, it can be prepared at this point and cached under the session-id key, so that it can be picked up quickly when the transmission connection is made. When a connection arrives with s=<session-id>, the data under this key, including the passphrase, is looked up and the passphrase is set on that connection in the listener callback.
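
A minimal sketch of that cache and of the quick decision the listener callback would make, assuming the peer check on source IP:port suggested earlier in the thread. SessionStore, on_listen, the "userpass" method name and the reject labels are all hypothetical; none of them are part of the SRT API.

```python
import secrets


class SessionStore:
    """In-memory cache of prepared sessions, keyed by session ID (hypothetical)."""

    def __init__(self):
        self._sessions = {}

    def open(self, username, passphrase, peer):
        """Called once the slow work on the auth connection has finished."""
        session_id = secrets.token_hex(8)
        self._sessions[session_id] = {
            "user": username,
            "passphrase": passphrase,
            "peer": peer,  # (ip, port) of the auth connection
        }
        return session_id

    def lookup(self, session_id, peer):
        """Return the entry only if it exists and comes from the same peer."""
        entry = self._sessions.get(session_id)
        if entry is None or entry["peer"] != peer:
            return None
        return entry

    def close(self, session_id):
        """Called when the auth connection is closed."""
        self._sessions.pop(session_id, None)


def on_listen(store, fields, peer):
    """Fast decision for the listener callback; fields is the parsed streamid."""
    if fields.get("t") == "auth":
        if fields.get("r") != "userpass":
            return ("reject", "unsupported-auth-method")
        return ("accept", None)  # no passphrase on the auth connection
    session = store.lookup(fields.get("s", ""), peer)
    if session is None:
        return ("reject", "unknown-session")
    return ("accept", session["passphrase"])
```

In a real listener callback the "accept" result would presumably translate into setting SRTO_PASSPHRASE on the new socket and returning success, and "reject" into setting a rejection code and returning an error; the point of the sketch is that both paths are plain dictionary lookups and therefore quick.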

May 04 '22 15:05 ethouris

Great. That makes sense. The only question I have is how this works in Rendezvous mode. Or maybe it can't?

May 04 '22 15:05 jlsantiago0

I don't think so. In Rendezvous mode you also don't know which party will be setting the streamid, which is why using the streamid is not recommended for rendezvous. Rendezvous also doesn't fit any client-server workflow model, and you can't establish multiple connections on one UDP link, unlike in the caller-listener layout. (Ah, I didn't tell you: it is also recommended that the caller use the same outgoing port for both connections; if you want it to be autoselected, the caller should read the outgoing port number from the socket of the first connection and enforce it on the second one.)

Using rendezvous for a model like this would make more sense if you also use a signaling server on the public internet, to which both parties (for example, both behind NAT) connect and exchange the appropriate data. They then use each other's endpoint as provided by the signaling server directly, without specifying anything in the streamid.

May 04 '22 15:05 ethouris

OK. Thank you for the detailed explanation. I assume that the third-party signaler in this case would be an SRT listener that each side would call, and that the signaler would then exchange the information in both directions by simply sending the StreamID of one side to the other over the data channels. Each side would then be able to use that information to establish the rendezvous connection to the other, perhaps using the passphrase info sent from the client side?

May 04 '22 15:05 jlsantiago0

If you are thinking of a signaling server in the form of an SRT listener that lets two behind-NAT clients connect to one another, then the streamID information would only be sent from the clients to the signaling server. The rendezvous connection established between the clients would then have to use dedicated ports on each side, not shared with the connection to the signaling server.

[EDIT] Sorry I forgot: you'd have to use a real STUN query to the signaling server using the ports reserved for the rendezvous connection, not just read the "peername" from the connection to the signaling server. That would be the UDP port used next for SRT rendezvous connection, but it shouldn't be used to send any real data, except the STUN query. The client should read the STUN response and send it to the signaling server so that it can pass this information to the opposite client.

The passphrase should be either predefined for a username that both parties would use (or if they can use multiple, then the signaling server should tell them which one), or if it's to be random-generated by the signaling server, then the public key exchange may be additionally required so that the response is sent encrypted.
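
The bookkeeping on the signaling server can be sketched as below, assuming each client reports the public endpoint it learned from its STUN response. SignalingRoom and the client names are hypothetical, and the sketch leaves out the transport, the STUN query itself and the passphrase handling.

```python
class SignalingRoom:
    """Pairs two clients and hands each the other's public endpoint (hypothetical)."""

    def __init__(self):
        self._endpoints = {}

    def register(self, client_id, public_endpoint):
        """Store the (ip, port) a client learned from its STUN response."""
        if client_id not in self._endpoints and len(self._endpoints) >= 2:
            raise ValueError("room already has two clients")
        self._endpoints[client_id] = public_endpoint

    def peer_of(self, client_id):
        """Return the other client's endpoint, or None if it has not registered yet."""
        if client_id not in self._endpoints:
            raise KeyError(client_id)
        for other, endpoint in self._endpoints.items():
            if other != client_id:
                return endpoint
        return None
```

Each client would then use the endpoint returned by peer_of() as the remote address of its rendezvous connection, with the local port being the one it used for the STUN query.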

May 04 '22 15:05 ethouris

OK. I think I understand now. Thank you.

May 04 '22 16:05 jlsantiago0
