pjnath/turn_sock: Bad session state on destruction of TURN socket
Describe the bug
When a TURN socket gets destroyed because of an error -- e.g. `pj_sock_socket' returns EMFILE -- the underlying session's status is not updated.
This results in a bad state of a relayed candidate of a ICE transport session.
Steps to reproduce
- Make a ICE transport session with relayed candidates
- Ensure that
pj_sock_socketfails inturn_on_state. - See failed assertion
pj_assert(pj_sockaddr_has_addr(&cand->addr))in pjnath/ice_strans.c.
For 2., the easiest way is to do fuzzing -- i.e. random failure -- on the socket system call with LD_PRELOAD.
PJSIP version
2.11
Context
Possible fix:
From bc178c596c61b645d38523e6eca8c24085b05e4e Mon Sep 17 00:00:00 2001
From: Olivier Dion <[email protected]>
Date: Fri, 17 Jun 2022 16:56:07 -0400
Subject: [PATCH] pjnath/turn_sock: Fix destruction of TURN sockets
When destroying a TURN socket on failure -- e.g. failed allocation of
socket -- the underlying session does not change its last status.
Change this by passing a `pj_status_t' value to `pj_turn_sock_destroy2'
and calling `pj_turn_session_destroy' instead of
`pj_turn_session_shutdown'.
The old `pj_turn_sock_destroy' remains for ABI compatibility and call
the new version `pj_turn_sock_destroy2' with status `PJ_SUCCESS'.
---
pjnath/include/pjnath/turn_sock.h | 1 +
pjnath/src/pjnath/turn_sock.c | 21 +++++++++++++--------
2 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/pjnath/include/pjnath/turn_sock.h b/pjnath/include/pjnath/turn_sock.h
index 05b5cafb..424c0612 100644
--- a/pjnath/include/pjnath/turn_sock.h
+++ b/pjnath/include/pjnath/turn_sock.h
@@ -412,6 +412,7 @@ PJ_DECL(pj_status_t) pj_turn_sock_create(pj_stun_config *cfg,
* @param turn_sock The TURN transport instance.
*/
PJ_DECL(void) pj_turn_sock_destroy(pj_turn_sock *turn_sock);
+PJ_DECL(void) pj_turn_sock_destroy2(pj_turn_sock *turn_sock, pj_status_t status);
/**
diff --git a/pjnath/src/pjnath/turn_sock.c b/pjnath/src/pjnath/turn_sock.c
index 7a88494c..24a1515a 100644
--- a/pjnath/src/pjnath/turn_sock.c
+++ b/pjnath/src/pjnath/turn_sock.c
@@ -424,6 +424,11 @@ static void destroy(pj_turn_sock *turn_sock)
}
PJ_DEF(void) pj_turn_sock_destroy(pj_turn_sock *turn_sock)
+{
+ pj_turn_sock_destroy2(turn_sock, PJ_SUCCESS);
+}
+
+PJ_DEF(void) pj_turn_sock_destroy2(pj_turn_sock *turn_sock, pj_status_t status)
{
pj_grp_lock_acquire(turn_sock->grp_lock);
if (turn_sock->is_destroying) {
@@ -432,7 +437,7 @@ PJ_DEF(void) pj_turn_sock_destroy(pj_turn_sock *turn_sock)
}
if (turn_sock->sess) {
- pj_turn_session_shutdown(turn_sock->sess);
+ pj_turn_session_destroy(turn_sock->sess, status);
/* This will ultimately call our state callback, and when
* session state is DESTROYING we will schedule a timer to
* destroy ourselves.
@@ -1277,7 +1282,7 @@ static void turn_on_state(pj_turn_session *sess,
/* Init socket */
status = pj_sock_socket(turn_sock->af, sock_type, 0, &sock);
if (status != PJ_SUCCESS) {
- pj_turn_sock_destroy(turn_sock);
+ pj_turn_sock_destroy2(turn_sock, status);
return;
}
@@ -1287,7 +1292,7 @@ static void turn_on_state(pj_turn_session *sess,
max_bind_retry);
if (status != PJ_SUCCESS) {
pj_sock_close(sock);
- pj_turn_sock_destroy(turn_sock);
+ pj_turn_sock_destroy2(turn_sock, status);
return;
}
/* Apply QoS, if specified */
@@ -1298,7 +1303,7 @@ static void turn_on_state(pj_turn_session *sess,
if (status != PJ_SUCCESS && !turn_sock->setting.qos_ignore_error)
{
pj_sock_close(sock);
- pj_turn_sock_destroy(turn_sock);
+ pj_turn_sock_destroy2(turn_sock, status);
return;
}
@@ -1417,7 +1422,7 @@ static void turn_on_state(pj_turn_session *sess,
&turn_sock->cert);
}
if (status != PJ_SUCCESS) {
- pj_turn_sock_destroy(turn_sock);
+ pj_turn_sock_destroy2(turn_sock, status);
return;
}
if (turn_sock->cert) {
@@ -1428,7 +1433,7 @@ static void turn_on_state(pj_turn_session *sess,
&turn_sock->ssl_sock);
if (status != PJ_SUCCESS) {
- pj_turn_sock_destroy(turn_sock);
+ pj_turn_sock_destroy2(turn_sock, status);
return;
}
@@ -1445,7 +1450,7 @@ static void turn_on_state(pj_turn_session *sess,
#endif
if (status != PJ_SUCCESS) {
- pj_turn_sock_destroy(turn_sock);
+ pj_turn_sock_destroy2(turn_sock, status);
return;
}
@@ -1487,7 +1492,7 @@ static void turn_on_state(pj_turn_session *sess,
"Failed to connect to %s",
pj_sockaddr_print(&info.server, addrtxt,
sizeof(addrtxt), 3)));
- pj_turn_sock_destroy(turn_sock);
+ pj_turn_sock_destroy2(turn_sock, status);
return;
}
--
2.36.1
Log, call stack, etc
Nope.
@oldiob Would you be able to create a pull request for this, then pjproject team can review it.
@sauwming Could you please take a look?
I tested the provided scenario here with pjsua sample app using the latest version, with --use-ice and --use-turn and didn't manage to get the assertion.
Checking the code, when the TURN socket creation failed, it shouldn't be added as a candidate, so I'm puzzled as to how it could then trigger the assertion pj_sockaddr_has_addr(&cand->addr).
Perhaps you can test with the latest version, or provide us with the PJSIP log (level 5 or 6) so we can better understand the flow. Thanks.
I guess it only applied to us finally
I noticed you pushed a commit to your forked repo and if I could suggest a change in the patch is that I would add pj_turn_session_shutdown2(last_err) instead of pj_turn_sock_destroy2(). There are a couple of reasons for this:
-
pj_turn_session_destroy()haslast_errparameter, butpj_turn_session_shutdown()doesn't. - The current behaviour of
pj_turn_sock_destroy()doesn't change from graceful TURN session shutdown to forceful destroy.
I have created a patch last time when looking into this issue but didn't manage to test and verify if it works since I couldn't reproduce the problem, so didn't create a PR for it. Attached is the patch: turn_session_3154_fix.txt