OvenMediaEngine
pushes fail when one of the push destinations times out
Discussed in https://github.com/AirenSoft/OvenMediaEngine/discussions/1572
Originally posted by vampirefrog on March 31, 2024:

I was streaming to about 12 pushes when one of them went down (I mean the target site I was streaming to, which is an OSP instance), and then all the other pushes started going down as well. Below is everything I could find in the log.
Is there some kind of timeout setting, or a setting that keeps the other pushes from going down when one is stuck? Or is this a bug in OME? It seems similar to https://github.com/AirenSoft/OvenMediaEngine/issues/819.
```
Mar 31 00:37:05 server3 OvenMediaEngine[2634930]: [2024-03-31 00:37:05.656] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 17430, threshold: 500, peak: 17430
Mar 31 00:37:10 server3 OvenMediaEngine[2634930]: [2024-03-31 00:37:10.668] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 17815, threshold: 500, peak: 17815
Mar 31 00:37:15 server3 OvenMediaEngine[2634930]: [2024-03-31 00:37:15.681] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 18201, threshold: 500, peak: 18201
Mar 31 00:37:20 server3 OvenMediaEngine[2634930]: [2024-03-31 00:37:20.693] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 18586, threshold: 500, peak: 18586
Mar 31 00:37:25 server3 OvenMediaEngine[2634930]: [2024-03-31 00:37:25.707] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 18971, threshold: 500, peak: 18971
Mar 31 00:37:30 server3 OvenMediaEngine[2634930]: [2024-03-31 00:37:30.742] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 19358, threshold: 500, peak: 19358
Mar 31 00:37:35 server3 OvenMediaEngine[2634930]: [2024-03-31 00:37:35.777] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 19746, threshold: 500, peak: 19746
Mar 31 00:37:40 server3 OvenMediaEngine[2634930]: [2024-03-31 00:37:40.811] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 20133, threshold: 500, peak: 20133
Mar 31 00:37:45 server3 OvenMediaEngine[2634930]: [2024-03-31 00:37:45.822] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 20518, threshold: 500, peak: 20518
Mar 31 00:37:50 server3 OvenMediaEngine[2634930]: [2024-03-31 00:37:50.838] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 20903, threshold: 500, peak: 20903
Mar 31 00:37:55 server3 OvenMediaEngine[2634930]: [2024-03-31 00:37:55.851] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 21289, threshold: 500, peak: 21289
Mar 31 00:38:00 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:00.865] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 21674, threshold: 500, peak: 21674
Mar 31 00:38:05 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:05.878] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 22060, threshold: 500, peak: 22060
Mar 31 00:38:10 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:10.891] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 22446, threshold: 500, peak: 22446
Mar 31 00:38:15 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:15.905] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 22830, threshold: 500, peak: 22830
Mar 31 00:38:20 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:20.918] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 23216, threshold: 500, peak: 23216
Mar 31 00:38:25 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:25.974] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 23604, threshold: 500, peak: 23604
Mar 31 00:38:30 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:30.990] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 23990, threshold: 500, peak: 23990
Mar 31 00:38:36 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:35.001] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 24375, threshold: 500, peak: 24375
Mar 31 00:38:41 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:41.015] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 24761, threshold: 500, peak: 24761
Mar 31 00:38:46 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:46.023] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 25146, threshold: 500, peak: 25146
Mar 31 00:38:51 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:51.042] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 25531, threshold: 500, peak: 25531
Mar 31 00:38:56 server3 OvenMediaEngine[2634930]: [2024-03-31 00:38:56.076] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 25918, threshold: 500, peak: 25918
Mar 31 00:39:01 server3 OvenMediaEngine[2634930]: [2024-03-31 00:39:01.090] W [AW-RTMPPush0:2634947] ManagedQueue | managed_queue.h:313 | [114] mngq:v=#default#live:s=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:p=pub:n=streamworker_rtmppush size has exceeded the threshold: queue: 26304, threshold: 500, peak: 26304
Mar 31 00:39:07 server3 systemd[1]: ovenmediaengine.service: A process of this unit has been killed by the OOM killer.
░░ Subject: A process of ovenmediaengine.service unit has been killed by the OOM killer.
```
The destination (OSP) had an internet hiccup, and connections stalled. So I'm assuming the output queue in OME just grew and grew until the process was killed. Perhaps if the queue grows too large, OME should consider the connection dead and try to reconnect. From the logs it looks like it only warns when the threshold is exceeded but takes no action. Perhaps there could be a warning threshold and a kill threshold. I don't know if it already does this, but that's my two cents.
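For reference, the log above shows the queue growing by roughly 385 entries every 5 seconds (about 77/s) past the fixed 500-entry warning threshold until the OOM killer fired, with no back-pressure ever applied. Below is a minimal sketch of the warn/kill idea, assuming a per-push send queue; the names `PushQueue`, `kWarnThreshold`, and `kKillThreshold` are hypothetical, not OvenMediaEngine's actual internals:

```cpp
#include <cstddef>
#include <cstdio>
#include <deque>
#include <string>
#include <utility>

// Hypothetical per-push send queue illustrating a warn/kill policy.
// Names and thresholds are illustrative, not OvenMediaEngine internals.
class PushQueue {
public:
    static constexpr size_t kWarnThreshold = 500;   // log a warning past this
    static constexpr size_t kKillThreshold = 5000;  // treat the push as dead past this

    explicit PushQueue(std::string id) : id_(std::move(id)) {}

    // Returns false when the push should be torn down and reconnected,
    // so one stalled destination cannot grow without bound.
    bool Enqueue(std::string packet) {
        if (queue_.size() >= kKillThreshold) {
            std::fprintf(stderr, "push %s: queue %zu >= kill threshold %zu, dropping session\n",
                         id_.c_str(), queue_.size(), kKillThreshold);
            queue_.clear();
            return false;  // caller closes the socket and schedules a reconnect
        }
        if (queue_.size() >= kWarnThreshold) {
            std::fprintf(stderr, "push %s: queue %zu exceeded warn threshold %zu\n",
                         id_.c_str(), queue_.size(), kWarnThreshold);
        }
        queue_.push_back(std::move(packet));
        return true;
    }

private:
    std::string id_;  // push ID, so the log identifies which destination is stuck
    std::deque<std::string> queue_;
};
```

The point of the hard limit is that exceeding it fails only the one stuck session, instead of letting its queue consume memory shared with the other eleven pushes.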
Hopefully this is resolved soon; it's one of the important features for us.
Can you post your Server.xml? I'm curious to see your settings.
I use this as a template on multiple servers:
```xml
<?xml version="1.0" encoding="UTF-8" ?>
<Server version="8">
    <Name>OvenMediaEngine</Name>
    <Type>origin</Type>
    <IP>*</IP>
    <PrivacyProtection>false</PrivacyProtection>
    <Modules>
        <HTTP2><Enable>true</Enable></HTTP2>
        <LLHLS><Enable>true</Enable></LLHLS>
    </Modules>
    <Bind>
        <Managers>
            <API>
                <Port>60080</Port>
                <TLSPort>60443</TLSPort>
                <WorkerCount>1</WorkerCount>
            </API>
        </Managers>
        <Providers>
            <RTMP>
                <Port>61935</Port>
                <WorkerCount>1</WorkerCount>
            </RTMP>
            <SRT>
                <Port>60999</Port>
                <WorkerCount>1</WorkerCount>
            </SRT>
        </Providers>
        <Publishers>
            <LLHLS>
                <Port>60080</Port>
                <TLSPort>60443</TLSPort>
                <WorkerCount>1</WorkerCount>
            </LLHLS>
        </Publishers>
    </Bind>
    <Managers>
        <Host>
            <Names>
                <Name>localhost:60080</Name>
            </Names>
            <TLS>
                <CertPath>cert.pem</CertPath>
                <KeyPath>key.pem</KeyPath>
                <ChainCertPath>cert.pem</ChainCertPath>
            </TLS>
        </Host>
        <API>
            <AccessToken>poopy</AccessToken>
            <CrossDomains><Url>*</Url></CrossDomains>
        </API>
    </Managers>
    <VirtualHosts>
        <VirtualHost>
            <Name>default</Name>
            <Distribution>LiveJoiner</Distribution>
            <Host>
                <Names><Name>*</Name></Names>
                <TLS>
                    <CertPath>cert.pem</CertPath>
                    <KeyPath>key.pem</KeyPath>
                    <ChainCertPath>cert.pem</ChainCertPath>
                </TLS>
            </Host>
            <Applications>
                <Application>
                    <Name>live</Name>
                    <Type>live</Type>
                    <Providers><RTMP/><SRT/></Providers>
                    <Publishers>
                        <RTMPPush></RTMPPush>
                        <LLHLS><CrossDomains><Url>*</Url></CrossDomains></LLHLS>
                    </Publishers>
                    <OutputProfiles>
                        <OutputProfile>
                            <Name>bypass_stream</Name>
                            <OutputStreamName>${OriginStreamName}</OutputStreamName>
                            <Encodes>
                                <Audio>
                                    <Name>bypass_audio</Name>
                                    <Bypass>true</Bypass>
                                </Audio>
                                <Video>
                                    <Name>bypass_video</Name>
                                    <Bypass>true</Bypass>
                                </Video>
                            </Encodes>
                        </OutputProfile>
                    </OutputProfiles>
                </Application>
            </Applications>
            <AdmissionWebhooks>
                <ControlServerUrl>http://localhost:9999/webhook/25</ControlServerUrl>
                <SecretKey>asdf</SecretKey>
                <Timeout>3000</Timeout>
                <Enables>
                    <Providers>rtmp,srt</Providers>
                    <Publishers></Publishers>
                </Enables>
            </AdmissionWebhooks>
            <CrossDomains>
                <Url>*</Url>
            </CrossDomains>
        </VirtualHost>
    </VirtualHosts>
</Server>
```
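For context, pushes against a config like this are created through OME's REST API (bound on port 60080 above, with AccessToken `poopy`, which the API expects Base64-encoded in a Basic Authorization header). A hypothetical libcurl sketch follows; the push ID, stream name, destination URL, and stream key are made up for illustration:

```cpp
#include <curl/curl.h>
#include <cstdio>

// Hypothetical example: start an RTMP push via OME's REST API
// (POST /v1/vhosts/{vhost}/apps/{app}:startPush). The id, stream name,
// destination url, and streamKey below are invented for illustration.
int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (!curl) return 1;

    // "cG9vcHk=" is base64("poopy"), the AccessToken from the config above.
    struct curl_slist *headers = nullptr;
    headers = curl_slist_append(headers, "Content-Type: application/json");
    headers = curl_slist_append(headers, "Authorization: Basic cG9vcHk=");

    const char *body =
        "{\"id\":\"push_01\","
        "\"stream\":{\"name\":\"mystream\"},"
        "\"protocol\":\"rtmp\","
        "\"url\":\"rtmp://example.com/live\","
        "\"streamKey\":\"secret\"}";

    curl_easy_setopt(curl, CURLOPT_URL,
                     "http://localhost:60080/v1/vhosts/default/apps/live:startPush");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);

    CURLcode res = curl_easy_perform(curl);
    if (res != CURLE_OK)
        std::fprintf(stderr, "request failed: %s\n", curl_easy_strerror(res));

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return res == CURLE_OK ? 0 : 1;
}
```

Repeating a call like this for each of the twelve destinations reproduces the setup described at the top of the thread.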
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@vampirefrog
Thank you for reporting. I think I know the cause of the issue you reported. I'll add it to my improvement plan and contact you when it's completed.
Thanks, buddy. I should also mention that the server is a $4 DigitalOcean droplet in SFO3, so you can probably test in a similar environment if that helps.
Also, there's another issue here: the log doesn't tell you which push is experiencing the problem, only that one of the pushes is stuck. Could you add the push ID to the log? For me, as the server admin, that would be very useful.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Y'all still working on this? I've found another issue where a long-running server fills up memory and stops sending some pushes, even though they still appear in the push list, and I'm not sure whether it's the same issue or not. After a restart it works.
@vampirefrog
I found a hang in a specific push session and applied a timeout to avoid it. I've patched the master branch; please test it when you have time to see whether the issue is fixed.
https://github.com/AirenSoft/OvenMediaEngine/commit/b53080a50fb9e877ee2cd30d26dd706c202d1264
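The actual fix lives in that commit; as a rough illustration of the general technique, a blocking send can be bounded with a POSIX socket send timeout so a stalled destination errors out instead of hanging the push session indefinitely. This is a generic sketch, not the patch itself:

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

// Generic POSIX sketch: bound a blocking send() so a stalled push
// destination fails fast instead of hanging the session forever.
bool SendWithTimeout(int sock, const void *data, size_t size) {
    // Give up on a send() that makes no progress for 5 seconds.
    struct timeval tv;
    tv.tv_sec = 5;
    tv.tv_usec = 0;
    if (setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) != 0) {
        std::perror("setsockopt(SO_SNDTIMEO)");
        return false;
    }

    ssize_t sent = send(sock, data, size, 0);
    if (sent < 0) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            // Timed out: treat the push as dead and let the caller reconnect.
            std::fprintf(stderr, "send timed out, closing push session\n");
        } else {
            std::perror("send");
        }
        return false;
    }
    return static_cast<size_t>(sent) == size;
}
```

With a bound like this in place, a destination that stalls (as the OSP instance did) produces a timeout error on its own session, rather than an unbounded queue that eventually takes the whole process down.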