owt-server
Timeout to make rpc to conference-7de8cd4d0d3bae78fa03@ipAddress_0.join
I am using OWT Server v5.0. MongoDB, RabbitMQ and OWT are all on the same machine.
server connection failed: Error: Timeout to make rpc to conference-7de8cd4d0d3bae78fa03@ipAddress_0.join.
This error message appears every time and login is unsuccessful. The portal sends the join message but never receives a reply, which is why the timeout occurs.
Need help: how do I resolve this issue?
Where did you get a compiled version of OWT Server V5.0? webrtc.intel.com only offers the Release-v4.3.1 version. I tried to build V5.0 but got blocked at this stage:
`running 'download_from_google_storage --no_resume --platform=linux* --no_auth --bucket chromium-clang-format -s src/buildtools/linux64/clang-format.sha1' in '/root/owt-server-4.3/owt-server/third_party/webrtc-m79'
0> Downloading src/buildtools/linux64/clang-format@942fc8b1789144b8071d3fc03ff0fcbe1cf81ac8...
Downloading 1 files took 8.500975 second(s)
Running hooks: 90% (20/22) msan_chained_origins`
I built from https://github.com/open-webrtc-toolkit/owt-server/archive/refs/tags/v5.0.zip and it succeeded without errors. The "Running hooks: 90%" step takes a long time to finish. Please follow https://github.com/open-webrtc-toolkit/owt-server#instructions
Thanks, the procedure just takes too much time and uses too much memory. When I tested the Release-v4.3.1 version I found a lot of problems and it was hard to make it work. I am reading the whole source code to figure out how the pipeline works and why.
By the way, almost all cloud server providers offer only Intel CPUs without Intel GPUs, which makes it a problem to use the Intel Media SDK for hardware acceleration.
I got the same issue as you did and don't know what to do about it. Every time I try to use the demo on port 3004, I get the error "server connection failed: Error: Timeout to make rpc to [email protected]", so could you please tell me how you solved it?
I also have a similar issue, if anyone knows how to solve it, I am interested.
Please check the file logs/conference-7de8cd4d0d3bae78fa03@ipAddress_0.log
to see if there were any errors.
Hello @starwarfan thanks for the advice. I did look into the logs but could not make much out of it. I attach the logs as well as my toml configuration. toml.zip logs.zip
There are a few timeouts in there, the root cause seems to be a node lost. I also looked in the logs from the cluster manager but could not find any indication of a failure there. Here is the first failure excerpt from the conference logs.
`2022-08-01 15:19:52.846 - DEBUG: AmqpClient - remoteCall, corrID: 6 to: [email protected]_0 method: enableVAD
...
2022-08-01 15:19:54.501 - DEBUG: AmqpClient - remoteCall, corrID: 10 to: [email protected]_0 method: onTransportSignaling
2022-08-01 15:19:54.846 - DEBUG: AmqpClient - remoteCall timeout, corrID: 6
2022-08-01 15:19:55.681 - DEBUG: AmqpClient - received monitoring message: { reason: 'abnormal', message: { purpose: 'webrtc', id: '[email protected]_0', type: 'node' } }
2022-08-01 15:19:55.681 - DEBUG: RtcController - terminateByLocality node [email protected]_0
2022-08-01 15:19:55.681 - DEBUG: RtcController - terminate, sessionId: bb557ef7fc014deabccd551b13682ac2 direction: out, Node lost
2022-08-01 15:19:55.681 - DEBUG: AmqpClient - remoteCall, corrID: 11 to: [email protected]_0 method: unsubscribe
2022-08-01 15:19:56.502 - DEBUG: AmqpClient - remoteCall timeout, corrID: 10
2022-08-01 15:19:56.502 - WARN: RtcController - Trnasport signaling RPC failed Timeout to make rpc to [email protected]_0.onTransportSignaling
2022-08-01 15:19:57.682 - DEBUG: AmqpClient - remoteCall timeout, corrID: 11
2022-08-01 15:19:57.682 - DEBUG: Conference - onSessionAborted, participantId: JttcPkM6qW_n3-WMAAAA sessionId: bb557ef7fc014deabccd551b13682ac2 direction: out reason: Node lost`
After that, it will try to clean things up and more timeouts follow.
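In case it helps anyone reading similar logs, here is a rough Python sketch that pairs each `remoteCall timeout` entry with the call that used the same corrID, so you can see which node and method timed out. The pattern is inferred only from the excerpt above, so treat it as an assumption about the log format. The function name `find_timed_out_calls` and the node id `webrtc-node@host_0` in the sample are made up for illustration.

```python
import re
from typing import Dict, List, Tuple

# Matches both "remoteCall, corrID: N to: <node> method: <name>" and
# "remoteCall timeout, corrID: N" (format assumed from the excerpt above).
ENTRY_RE = re.compile(
    r"remoteCall( timeout)?, corrID: (\d+)(?: to: (\S+) method: (\w+))?"
)


def find_timed_out_calls(log_text: str) -> List[Tuple[int, str, str]]:
    """Return (corrID, target node, method) for every call that timed out."""
    pending: Dict[int, Tuple[str, str]] = {}
    timed_out: List[Tuple[int, str, str]] = []
    # finditer scans the whole text, so it also works when several
    # log entries ended up on a single line.
    for match in ENTRY_RE.finditer(log_text):
        is_timeout, corr_id, target, method = match.groups()
        if is_timeout:
            # Look up the call that was issued with this corrID, if we saw it.
            node, name = pending.get(int(corr_id), ("unknown", "unknown"))
            timed_out.append((int(corr_id), node, name))
        elif target and method:
            pending[int(corr_id)] = (target, method)
    return timed_out


# Hypothetical sample; the node id is a placeholder, not a real one.
SAMPLE_LOG = """\
2022-08-01 15:19:52.846 - DEBUG: AmqpClient - remoteCall, corrID: 6 to: webrtc-node@host_0 method: enableVAD
2022-08-01 15:19:54.501 - DEBUG: AmqpClient - remoteCall, corrID: 10 to: webrtc-node@host_0 method: onTransportSignaling
2022-08-01 15:19:54.846 - DEBUG: AmqpClient - remoteCall timeout, corrID: 6
"""

for corr_id, node, method in find_timed_out_calls(SAMPLE_LOG):
    print(f"corrID {corr_id}: {method} on {node} timed out")
```

If all the timed-out calls point at the same node, that matches the "Node lost" reading of the excerpt.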
I had the same issue with v4.3.1. My solution was to change the config:
#webrtc_agent/agent.toml
[webrtc]
network_interfaces = [] # before it was [{name = "eth0"}]
#portal/portal.toml
[portal]
ip_address = "" # before it was my server ip address
Then rerun the init and start scripts and it works. My guess is that these settings made the components use the public IP for RabbitMQ messaging, which caused the problem, so I'm also afraid the problem will come back when deploying in a multi-node environment.
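One way to test that guess, as a sketch only: check whether the RabbitMQ port accepts TCP connections on each address the components might use. This assumes RabbitMQ listens on its default AMQP port 5672. The helper name `amqp_port_reachable` is made up for illustration, and a successful TCP connect does not prove that RPC messaging works, only that the port is open on that address.

```python
import socket


def amqp_port_reachable(host: str, port: int = 5672, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    5672 is RabbitMQ's default AMQP port; pass another port if your
    setup differs (assumption, not taken from the OWT config).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, compare `amqp_port_reachable("127.0.0.1")` with the result for the server's public IP. If only the loopback address is reachable, that would be consistent with the public-IP explanation.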