SIGSEGV at 'RTC::WebRtcTransport::OnIceServerTupleRemoved'
Bug Report
It occurred 12 times in three days on ten servers.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000fffe19d71b4c in non-virtual thunk to RTC::WebRtcTransport::OnIceServerTupleRemoved(RTC::IceServer const*, RTC::TransportTuple*) () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
[Current thread is 1 (LWP 1882413)]
(gdb)
(gdb) bt
#0 0x0000fffe19d71b4c in non-virtual thunk to RTC::WebRtcTransport::OnIceServerTupleRemoved(RTC::IceServer const*, RTC::TransportTuple*) () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#1 0x0000fffe19d71e68 in non-virtual thunk to RTC::WebRtcTransport::OnRtcTcpConnectionClosed(RTC::TcpServer*, RTC::TcpConnection*) () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#2 0x0000fffe1a1561c4 in TcpServerHandler::OnTcpConnectionClosed(TcpConnectionHandler*) () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#3 0x0000fffe19ea547c in uv__read () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#4 0x0000fffe19ea5b30 in uv__stream_io () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#5 0x0000fffe19eac7bc in uv__io_poll () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#6 0x0000fffe19e9faf4 in uv_run () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#7 0x0000fffe19ceb34c in DepLibUV::RunLoop() () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#8 0x0000fffe19cf4608 in Worker::Worker(Channel::ChannelSocket*, PayloadChannel::PayloadChannelSocket*) () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#9 0x0000fffe19ce9b88 in mediasoup_worker_run () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#10 0x0000fffe19b8c55c in std::sys_common::backtrace::__rust_begin_short_backtrace () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#11 0x0000fffe19bc5754 in core::ops::function::FnOnce::call_once{{vtable-shim}} () from /opt/ovice_ex_core/ovice_ex_core/lib/mediasoup_elixir-0.4.1/priv/native/libmediasoup_elixir.so
#12 0x0000fffe1a20f654 in <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872
#13 <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872
#14 std::sys::unix::thread::Thread::new::thread_start () at library/std/src/sys/unix/thread.rs:108
#15 0x0000ffff9f2733f0 in ?? ()
Your environment
- Operating system: Ubuntu 20.04.2
- Node version: N/A
- npm version: N/A
- gcc/clang version: gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
- mediasoup version: Rust version (mediasoup: 0.10.0, mediasoup-sys: 0.4.2)
- mediasoup-client version: 3.6.50
Issue description
I'll take a look when I have time. This doesn't look good. I assume this is a regression in the recent version too?
Probably yes. The crashing function comes from the WebRtcServer PR.
Are you using WebRTC server or not yet?
Not yet, I just updated from 0.9.3 to 0.10.0 and changed the type ListenIP that causes build errors.
@satoren, could you please show the result of bt full?
Probably doesn't help, as the C++ symbols seem to have been lost.
(gdb) bt full
#0 0x0000fffe09549b4c in non-virtual thunk to RTC::WebRtcTransport::OnIceServerTupleRemoved(RTC::IceServer const*, RTC::TransportTuple*) () at library/std/src/sync/once.rs:494
No symbol table info available.
#1 0x0000fffe09549e68 in non-virtual thunk to RTC::WebRtcTransport::OnRtcTcpConnectionClosed(RTC::TcpServer*, RTC::TcpConnection*) () at library/std/src/sync/once.rs:494
No symbol table info available.
#2 0x0000fffe0992e1c4 in TcpServerHandler::OnTcpConnectionClosed(TcpConnectionHandler*) () at library/std/src/sync/once.rs:494
No symbol table info available.
#3 0x0000fffe0967d47c in uv__read () at library/std/src/sync/once.rs:494
No symbol table info available.
#4 0x0000fffe0967db30 in uv__stream_io () at library/std/src/sync/once.rs:494
No symbol table info available.
#5 0x0000fffe096847bc in uv__io_poll () at library/std/src/sync/once.rs:494
No symbol table info available.
#6 0x0000fffe09677af4 in uv_run () at library/std/src/sync/once.rs:494
No symbol table info available.
#7 0x0000fffe094c334c in DepLibUV::RunLoop() () at library/std/src/sync/once.rs:494
No symbol table info available.
#8 0x0000fffe094cc608 in Worker::Worker(Channel::ChannelSocket*, PayloadChannel::PayloadChannelSocket*) () at library/std/src/sync/once.rs:494
No symbol table info available.
#9 0x0000fffe094c1b88 in mediasoup_worker_run () at library/std/src/sync/once.rs:494
No symbol table info available.
#10 0x0000fffe0936455c in std::sys_common::backtrace::__rust_begin_short_backtrace () at library/std/src/sync/once.rs:494
No symbol table info available.
#11 0x0000fffe0939d754 in core::ops::function::FnOnce::call_once{{vtable-shim}} () at library/std/src/sync/once.rs:494
No symbol table info available.
#12 0x0000fffe099e7654 in <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872
No locals.
#13 <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872
No locals.
#14 std::sys::unix::thread::Thread::new::thread_start () at library/std/src/sys/unix/thread.rs:108
No locals.
#15 0x0000ffff8e3173f0 in ?? ()
Not yet, I just updated from 0.9.3 to 0.10.0 and changed the type ListenIP that causes build errors.
What is this exactly?
Probably doesn't help, as the C++ symbols seem to have been lost.
Since you are hitting the problem quite often, it would be helpful to have the symbols for at least one of the dumps.
Not yet, I just updated from 0.9.3 to 0.10.0 and changed the type ListenIP that causes build errors.
I have no idea what you mean here.
I meant to say that I am not using the new features (WebRTCServer, etc.) in the version update. The Rust version had a slight type change, so the build did not pass as is.
So you are using Rust, you are not using WebRtcServer at all, and you changed something in the Rust code. And it crashes due to something in WebRtcServer (that you are not even using) when an ICE TCP connection is closed. How can that be? What change did you make on the Rust side, and why?
The only change he made was to get the code compiling: some arguments were refactored. The library code didn't change much IIRC, but I may check again later.
It looks like there might be a regression, either in the Rust library or in the worker, that happens when WebRtcServer isn't used.
Now I realize that the crash is in RTC::WebRtcTransport::OnIceServerTupleRemoved() rather than in WebRtcServer...
Maybe the cause is a use of freed memory.

There is no freed memory usage in there. this->tuples.erase(it) just removes a TransportTuple pointer from a map, it doesn't free it.
There is no freed memory usage in there.
this->tuples.erase(it) just removes a TransportTuple pointer from a map, it doesn't free it.
But this->tuples is std::list<RTC::TransportTuple>, not std::map<RTC::TransportTuple*>
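To make the difference concrete, here is a minimal sketch. It is not the mediasoup code: `Tuple` is a hypothetical stand-in for RTC::TransportTuple, and both function names are made up for illustration. It counts destructor calls to show that erasing from a std::list of objects destroys the element (so any pointer previously taken to it dangles), while erasing from a map of pointers leaves the pointed-to object alive.

```cpp
#include <cassert>
#include <list>
#include <map>

// Hypothetical stand-in for RTC::TransportTuple; counts destructor calls.
struct Tuple
{
    static int destroyed;
    int id;
    explicit Tuple(int id) : id(id) {}
    ~Tuple() { ++destroyed; }
};
int Tuple::destroyed = 0;

// Erasing from a std::list<Tuple> destroys the stored element itself.
int DestroyedByListErase()
{
    std::list<Tuple> tuples;
    tuples.emplace_back(1);
    tuples.emplace_back(2);

    int before = Tuple::destroyed;
    Tuple* selected = &tuples.front(); // points into the list node
    tuples.erase(tuples.begin());      // destructor runs; `selected` now dangles
    (void)selected;                    // must not be dereferenced from here on

    return Tuple::destroyed - before;
}

// Erasing from a map of pointers only removes the pointer entry.
int DestroyedByPointerMapErase()
{
    Tuple owned(1);
    std::map<int, Tuple*> tuples{ { owned.id, &owned } };

    int before = Tuple::destroyed;
    tuples.erase(owned.id); // `owned` is untouched and still valid

    return Tuple::destroyed - before;
}
```

Calling `DestroyedByListErase()` reports one destructor call and `DestroyedByPointerMapErase()` reports none. If a transport keeps a raw pointer to a list element (for example a "selected tuple") and that element is erased, later use of the pointer would be a use-after-free, which would be consistent with the SIGSEGV reported here.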
GOOD POINT!
@ybybwdwd please take a look: https://github.com/versatica/mediasoup/pull/915