
Edge: Crashed in remote mode + low_latency settings

Open vladimir131313 opened this issue 5 years ago • 3 comments

Description

I am using SRS as an Edge cluster with the low_latency settings from here (https://github.com/ossrs/srs/wiki/v3_EN_LowLatency). I noticed that some players make hundreds of short HTTP connections to SRS, and it crashes. I investigated and found that I can reproduce the crash from my terminal: I ran short curl requests (with a one-second timeout) against the FLV stream in a loop, and SRS crashed. It usually crashes after about ten requests.

For example:

for ((i=1;i<50;i++)); do echo -n $i; curl -v -s -o /dev/null -m 1 http://127.0.0.1:20180/test/test.flv 2>&1 | grep -E '(Failed)|(HTTP)' | grep -v GET; done
1< HTTP/1.1 200 OK
2< HTTP/1.1 200 OK
3< HTTP/1.1 200 OK
4< HTTP/1.1 200 OK
5< HTTP/1.1 200 OK
6< HTTP/1.1 200 OK
78* Failed to connect to 127.0.0.1 port 20180: Connection refused
9* Failed to connect to 127.0.0.1 port 20180: Connection refused
10* Failed to connect to 127.0.0.1 port 20180: Connection refused
...

*** abrt[13425]: Saved core dump of pid 13240 (srs/trunk/objs/srs) to /var/spool/abrt/ccpp-2020-05-13-16:44:11-13240 (2400256 bytes)

After some tests I found that the problem is in the "mw_latency" directive, which I use for low-latency streaming.

    play {
        gop_cache       off;
        queue_length    10;
        mw_latency      100;
    }

When I commented it out, my test succeeded.

    play {
        gop_cache       off;
        queue_length    10;
        # mw_latency      100;
    }

for ((i=1;i<50;i++)); do echo -n $i; curl -v -s -o /dev/null -m 1 http://127.0.0.1:20180/test/test.flv 2>&1 | grep -E '(Failed)|(HTTP)' | grep -v GET; done
1< HTTP/1.1 200 OK
2< HTTP/1.1 200 OK
3< HTTP/1.1 200 OK
4< HTTP/1.1 200 OK
...
48< HTTP/1.1 200 OK
49< HTTP/1.1 200 OK
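
To compare the two runs at a glance, the loop output can be piped through a small counter. This is only a sketch: `summarize_probes` is a hypothetical helper name, and it matches the two message formats shown in the output above. Iterations that print neither message (such as iteration 7 in the failing run) are not counted.

```shell
#!/usr/bin/env bash
# Count successful and refused probes in the loop output read from stdin.
# Prints one line of the form "ok=<n> failed=<n>".
summarize_probes() {
    awk '
        /HTTP\/1\.1 200 OK/ { ok++ }      # curl received a 200 response line
        /Failed to connect/ { failed++ }  # curl could not open the TCP connection
        END { printf "ok=%d failed=%d\n", ok, failed }
    '
}
```

Usage: append `| summarize_probes` after the closing `done` of the loop. A run without the crash should report `failed=0`.
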

  1. SRS version: 3.0.140
  2. The log of SRS is as follows:

[2020-05-13 17:01:48.531][Trace][24091][522] HTTP client ip=127.0.0.1, request=0, to=15000ms
[2020-05-13 17:01:48.531][Trace][24091][522] HTTP GET http://127.0.0.1:20180/test/test.flv, content-length=-1
[2020-05-13 17:01:48.531][Trace][24091][522] http: mount flv stream for sid=/test/test, mount=/test/test.flv
[2020-05-13 17:01:48.531][Trace][24091][522] flv: source url=/test/test, is_edge=1, source_id=-1[-1]
[2020-05-13 17:01:48.531][Trace][24091][522] create consumer, active=0, queue_size=0.00, jitter=10000000
[2020-05-13 17:01:48.531][Trace][24091][522] ignore disabled exec for vhost=__defaultVhost__
[2020-05-13 17:01:48.532][Trace][24091][522] set fd=11 TCP_NODELAY 0=>1
[2020-05-13 17:01:48.532][Trace][24091][522] set fd=11, SO_SNDBUF=660150=>50000, buffer=100ms
[2020-05-13 17:01:48.532][Trace][24091][522] FLV /test/test.flv, encoder=FLV, nodelay=1, mw_sleep=100ms, cache=0, msgs=128
[2020-05-13 17:01:48.532][Trace][24091][522] update source_id=522[522]
[2020-05-13 17:01:48.540][Trace][24091][524] complex handshake success.
[2020-05-13 17:01:48.540][Trace][24091][524] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2020-05-13 17:01:48.620][Trace][24091][524] connected, version=3.0.140.0, ip=127.0.0.1, pid=11749, id=1181, dsu=1
[2020-05-13 17:01:48.620][Trace][24091][524] edge change from 100 to state 101 (pull).
[2020-05-13 17:01:48.621][Trace][24091][524] got metadata, width=424, height=240, vcodec=7, acodec=10
[2020-05-13 17:01:48.621][Trace][24091][524] 4B audio sh, codec(10, profile=LC, 2channels, 0kbps, 48000HZ), flv(16bits, 2channels, 44100HZ)
[2020-05-13 17:01:48.621][Trace][24091][524] 43B video sh,  codec(7, profile=Baseline, level=3, 432x240, 0kbps, 0.0fps, 0.0s)
[2020-05-13 17:01:48.631][Trace][24091][522] update source_id=524[524]
[2020-05-13 17:01:48.631][Trace][24091][522] FLV: write header audio=1, video=1
[2020-05-13 17:01:49.533][Warn][24091][524][4] origin disconnected, retry, error code=1007 : recv message : recv interlaced message : read basic header : basic header requires 1 bytes : read bytes : read
thread [24091][524]: ingest() [src/app/srs_app_edge.cpp:333][errno=4]
thread [24091][524]: recv_message() [src/protocol/srs_rtmp_stack.cpp:389][errno=4]
thread [24091][524]: recv_interlaced_message() [src/protocol/srs_rtmp_stack.cpp:871][errno=4]
thread [24091][524]: read_basic_header() [src/protocol/srs_rtmp_stack.cpp:966][errno=4]
thread [24091][524]: grow() [src/protocol/srs_protocol_stream.cpp:179][errno=4]
thread [24091][524]: read() [src/service/srs_service_st.cpp:490][errno=4]
[2020-05-13 17:01:49.539][Trace][24091][525] HTTP client ip=127.0.0.1, request=0, to=15000ms
[2020-05-13 17:01:49.539][Trace][24091][525] HTTP GET http://127.0.0.1:20180/test/test.flv, content-length=-1
[2020-05-13 17:01:49.539][Trace][24091][525] dispatch cached gop success. count=49, duration=716
[2020-05-13 17:01:49.539][Trace][24091][525] create consumer, active=1, queue_size=0.00, jitter=10000000
[2020-05-13 17:01:49.539][Trace][24091][525] set fd=13 TCP_NODELAY 0=>1
[2020-05-13 17:01:49.539][Trace][24091][525] set fd=13, SO_SNDBUF=660150=>50000, buffer=100ms
[2020-05-13 17:01:49.540][Trace][24091][525] FLV /test/test.flv, encoder=FLV, nodelay=1, mw_sleep=100ms, cache=0, msgs=128
[2020-05-13 17:01:49.540][Trace][24091][525] FLV: write header audio=1, video=1
[2020-05-13 17:01:50.541][Trace][24091][525] cleanup when unpublish
[2020-05-13 17:01:50.541][Trace][24091][525] edge change from 101 to state 0 (init).
[2020-05-13 17:01:50.541][Warn][24091][525][4] client disconnect peer. ret=1007
[2020-05-13 17:01:50.548][Trace][24091][526] HTTP client ip=127.0.0.1, request=0, to=15000ms
[2020-05-13 17:01:50.548][Trace][24091][526] HTTP GET http://127.0.0.1:20180/test/test.flv, content-length=-1
[2020-05-13 17:01:50.549][Trace][24091][526] flv: source url=/test/test, is_edge=1, source_id=-1[-1]
[2020-05-13 17:01:50.549][Trace][24091][526] create consumer, active=0, queue_size=0.00, jitter=10000000
[2020-05-13 17:01:50.549][Trace][24091][526] ignore disabled exec for vhost=__defaultVhost__
[2020-05-13 17:01:50.549][Trace][24091][526] set fd=12 TCP_NODELAY 0=>1
[2020-05-13 17:01:50.549][Trace][24091][526] set fd=12, SO_SNDBUF=660150=>50000, buffer=100ms
[2020-05-13 17:01:50.549][Trace][24091][526] FLV /test/test.flv, encoder=FLV, nodelay=1, mw_sleep=100ms, cache=0, msgs=128
[2020-05-13 17:01:50.549][Trace][24091][526] update source_id=526[526]
[2020-05-13 17:01:50.557][Trace][24091][527] complex handshake success.
[2020-05-13 17:01:50.557][Trace][24091][527] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2020-05-13 17:01:50.638][Trace][24091][527] connected, version=3.0.140.0, ip=127.0.0.1, pid=11749, id=1182, dsu=1
[2020-05-13 17:01:50.638][Trace][24091][527] edge change from 100 to state 101 (pull).
[2020-05-13 17:01:50.638][Trace][24091][527] got metadata, width=424, height=240, vcodec=7, acodec=10
[2020-05-13 17:01:50.638][Trace][24091][527] 4B audio sh, codec(10, profile=LC, 2channels, 0kbps, 48000HZ), flv(16bits, 2channels, 44100HZ)
[2020-05-13 17:01:50.638][Trace][24091][527] 43B video sh,  codec(7, profile=Baseline, level=3, 432x240, 0kbps, 0.0fps, 0.0s)
[2020-05-13 17:01:50.649][Trace][24091][526] update source_id=527[527]
[2020-05-13 17:01:50.649][Trace][24091][526] FLV: write header audio=1, video=1
[2020-05-13 17:01:51.550][Warn][24091][527][4] origin disconnected, retry, error code=1007 : recv message : recv interlaced message : read basic header : basic header requires 1 bytes : read bytes : read
thread [24091][527]: ingest() [src/app/srs_app_edge.cpp:333][errno=4]
thread [24091][527]: recv_message() [src/protocol/srs_rtmp_stack.cpp:389][errno=4]
thread [24091][527]: recv_interlaced_message() [src/protocol/srs_rtmp_stack.cpp:871][errno=4]
thread [24091][527]: read_basic_header() [src/protocol/srs_rtmp_stack.cpp:966][errno=4]
thread [24091][527]: grow() [src/protocol/srs_protocol_stream.cpp:179][errno=4]
thread [24091][527]: read() [src/service/srs_service_st.cpp:490][errno=4]
[2020-05-13 17:01:51.556][Trace][24091][528] HTTP client ip=127.0.0.1, request=0, to=15000ms
[2020-05-13 17:01:51.557][Trace][24091][528] HTTP GET http://127.0.0.1:20180/test/test.flv, content-length=-1
[2020-05-13 17:01:51.557][Trace][24091][528] dispatch cached gop success. count=62, duration=906
[2020-05-13 17:01:51.557][Trace][24091][528] create consumer, active=1, queue_size=0.00, jitter=10000000
[2020-05-13 17:01:51.557][Trace][24091][528] set fd=14 TCP_NODELAY 0=>1
[2020-05-13 17:01:51.557][Trace][24091][528] set fd=14, SO_SNDBUF=660150=>50000, buffer=100ms
[2020-05-13 17:01:51.557][Trace][24091][528] FLV /test/test.flv, encoder=FLV, nodelay=1, mw_sleep=100ms, cache=0, msgs=128
[2020-05-13 17:01:51.557][Trace][24091][528] FLV: write header audio=1, video=1
[2020-05-13 17:01:52.533][Trace][24091][522] cleanup when unpublish
[2020-05-13 17:01:52.533][Trace][24091][522] edge change from 101 to state 0 (init).
[2020-05-13 17:01:52.533][Warn][24091][522][4] client disconnect peer. ret=1007
[2020-05-13 17:01:52.558][Warn][24091][528][104] server disconnect. ret=4040
[2020-05-13 17:01:52.565][Trace][24091][529] HTTP client ip=127.0.0.1, request=0, to=15000ms
[2020-05-13 17:01:52.565][Trace][24091][529] HTTP GET http://127.0.0.1:20180/test/test.flv, content-length=-1
[2020-05-13 17:01:52.565][Trace][24091][529] flv: source url=/test/test, is_edge=1, source_id=-1[-1]
[2020-05-13 17:01:52.565][Trace][24091][529] create consumer, active=0, queue_size=0.00, jitter=10000000
[2020-05-13 17:01:52.565][Trace][24091][529] ignore disabled exec for vhost=__defaultVhost__
[2020-05-13 17:01:52.565][Trace][24091][529] set fd=11 TCP_NODELAY 0=>1
[2020-05-13 17:01:52.565][Trace][24091][529] set fd=11, SO_SNDBUF=660150=>50000, buffer=100ms
[2020-05-13 17:01:52.565][Trace][24091][529] FLV /test/test.flv, encoder=FLV, nodelay=1, mw_sleep=100ms, cache=0, msgs=128
[2020-05-13 17:01:52.565][Trace][24091][529] update source_id=529[529]
[2020-05-13 17:01:52.574][Trace][24091][530] complex handshake success.
[2020-05-13 17:01:52.574][Trace][24091][530] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2020-05-13 17:01:52.655][Trace][24091][530] connected, version=3.0.140.0, ip=127.0.0.1, pid=11749, id=1183, dsu=1
[2020-05-13 17:01:52.655][Trace][24091][530] edge change from 100 to state 101 (pull).
[2020-05-13 17:01:52.655][Trace][24091][530] got metadata, width=424, height=240, vcodec=7, acodec=10
[2020-05-13 17:01:52.655][Trace][24091][530] 4B audio sh, codec(10, profile=LC, 2channels, 0kbps, 48000HZ), flv(16bits, 2channels, 44100HZ)
[2020-05-13 17:01:52.655][Trace][24091][530] 43B video sh,  codec(7, profile=Baseline, level=3, 432x240, 0kbps, 0.0fps, 0.0s)
[2020-05-13 17:01:52.665][Trace][24091][529] update source_id=530[530]
[2020-05-13 17:01:52.665][Trace][24091][529] FLV: write header audio=1, video=1
[2020-05-13 17:01:53.574][Trace][24091][531] HTTP client ip=127.0.0.1, request=0, to=15000ms
[2020-05-13 17:01:53.574][Trace][24091][531] HTTP GET http://127.0.0.1:20180/test/test.flv, content-length=-1
[2020-05-13 17:01:53.574][Trace][24091][531] dispatch cached gop success. count=24, duration=369
[2020-05-13 17:01:53.574][Trace][24091][531] create consumer, active=1, queue_size=0.00, jitter=10000000
[2020-05-13 17:01:53.574][Trace][24091][531] set fd=14 TCP_NODELAY 0=>1
[2020-05-13 17:01:53.574][Trace][24091][531] set fd=14, SO_SNDBUF=660150=>50000, buffer=100ms
[2020-05-13 17:01:53.574][Trace][24091][531] FLV /test/test.flv, encoder=FLV, nodelay=1, mw_sleep=100ms, cache=0, msgs=128
[2020-05-13 17:01:53.574][Trace][24091][531] FLV: write header audio=1, video=1
[2020-05-13 17:01:53.666][Warn][24091][529][11] client disconnect peer. ret=1007
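
The log interleaves several connections, so it can help to filter it. The sketch below assumes the line layout seen above, `[time][level][pid][id] message`, where the fourth bracketed field is the per-connection ID. `warn_lines` and `lines_for_cid` are hypothetical helper names, and the `thread [...]` stack lines are deliberately not matched.

```shell
#!/usr/bin/env bash
# Both helpers split each line on '[' and ']', so that for a line such as
#   [2020-05-13 17:01:48.531][Trace][24091][522] HTTP client ...
# field 4 is the level ("Trace") and field 8 is the connection ID ("522").

# Print only Warn-level lines from the given files (or stdin).
warn_lines() {
    awk -F'[][]' '$4 == "Warn"' "$@"
}

# Print only the lines that belong to one connection ID.
# Usage: lines_for_cid <id> [file...]
lines_for_cid() {
    local cid=$1
    shift
    awk -F'[][]' -v cid="$cid" '$8 == cid' "$@"
}
```

Usage: `lines_for_cid 522 objs/logs/srs.log` follows the first HTTP client from its request at 17:01:48.531 to its disconnect at 17:01:52.533.
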

  3. The configuration of SRS is as follows:

cat conf/srs.conf 
listen              20135;
max_connections     10000;
srs_log_tank        file;
srs_log_file        objs/logs/srs.log;
http_api {
    enabled         on;
    listen          20185;
}
http_server {
    enabled         on;
    listen          20180;
    dir             objs/nginx/html;
}
stats {
    network         0;
}

vhost __defaultVhost__ {
    tcp_nodelay     on
    min_latency     on;

    play {
        gop_cache       off;
        queue_length    10;
        mw_latency      100;
    }

    publish {
        mr off;
    }
    cluster {
        mode remote;
        origin 127.0.0.1:40135;
    }
    http_remux {
        enabled     on;
        mount       [vhost]/[app]/[stream].flv;
        hstrs       on;
    }
}


cat conf/srs-pub.conf 
# main config for srs.
# @see full.conf for detail config.

listen              40135;
pid                 objs/srs-pub.pid;
max_connections     10000;
srs_log_tank        file;
srs_log_file        objs/logs/srs-pub.log;
http_api {
    enabled         on;
    listen          40185;
}
stats {
    network         0;
}

vhost __defaultVhost__ {
    tcp_nodelay     on
    min_latency     on;

    play {
        gop_cache       off;
        queue_length    10;
        mw_latency      100;
    }

    publish {
        mr off;
    }
}

Replay

  1. ./objs/srs -c conf/srs-pub.conf >/dev/null 2>&1
  2. ./objs/srs -c conf/srs.conf >/dev/null 2>&1
  3. ffmpeg -i INPUT -vcodec copy -acodec copy -f flv rtmp://127.0.0.1:40135/test/test
  4. for ((i=1;i<50;i++)); do echo -n $i; curl -v -s -o /dev/null -m 1 http://127.0.0.1:20180/test/test.flv 2>&1 | grep -E '(Failed)|(HTTP)' | grep -v GET; done
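
For repeated testing, step 4 can be wrapped so that it stops at the first refused connection and reports the iteration number. This is a sketch, not part of SRS: `probe_once` and `first_refused` are hypothetical names, and the URL is the one used in the steps above. It relies on curl's documented exit codes, 7 for a failed connection and 28 for the timeout that `-m 1` produces on a live stream.

```shell
#!/usr/bin/env bash
# URL of the HTTP-FLV stream served by the Edge server (from the steps above).
URL=${URL:-http://127.0.0.1:20180/test/test.flv}

# One short-lived request: give up after one second, discard the body.
probe_once() {
    curl -s -o /dev/null -m 1 "$URL"
}

# Run up to <max> probes. On the first probe that exits with code 7
# (connection failed), print the iteration number and return 1.
# Return 0 if every probe could connect.
# Usage: first_refused [max] [probe_command]
first_refused() {
    local max=${1:-49} probe=${2:-probe_once} i rc
    for ((i = 1; i <= max; i++)); do
        rc=0
        "$probe" || rc=$?
        if [ "$rc" -eq 7 ]; then
            echo "$i"
            return 1
        fi
    done
    return 0
}
```

Usage: `first_refused 49 || echo "server stopped accepting connections"`. The second argument lets a different probe command be substituted, which also makes the loop testable without a running server.
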

Expect

Is it possible to fix this? Thanks.


vladimir131313 • May 13 '20 14:05

Check it.

winlinvip • Dec 01 '20 10:12

When a viewer starts and stops pulling a stream within a very short interval, it may trigger a coroutine switch while the Edge server is reconnecting, which can lead to problems such as use of a freed object. We need to reproduce this issue.

winlinvip • Apr 22 '24 00:04

For a similar issue, see https://github.com/ossrs/srs/issues/1829

winlinvip • Apr 22 '24 00:04