CLUSTER: Pulling a stream from a node in an origin cluster fails when the stream is pulled again within 3 seconds of disconnecting
Note: Please read the FAQ before filing an issue: https://github.com/ossrs/srs/issues/2716
Description
(Please describe your issue here)
There is an origin cluster with three nodes, A, B and C, plus one edge node for playback. A stream is being published to node C. When a player pulls the stream from the edge node, the first playback works; but if playback is stopped and the stream is pulled again immediately, it fails to play and SRS logs an error. When the player pulls the stream directly from node C in the origin cluster, playback works, and stopping and immediately re-pulling also works, no matter how many times it is repeated.
Repeated testing shows that when pulling from the edge node, re-pulling within 3 seconds of disconnecting a normal playback always fails.
The latest srs-server-4.0-b6 shows the same problem in my tests.
Searching the issues, I thought this was #1520, but after reading it carefully it is not. (https://github.com/ossrs/srs/issues/1520)
The error is:
[2022-02-07 16:34:54.525][Error][109157][t4269891][11] serve error code=2020 : service cycle : rtmp: stream service : rtmp: create consumer : play edge : state is stopping
From the error message, it looks like the edge node is still in the middle of a stop operation and refuses the new pull.
- SRS Version: srs-server-4.0-b1 (4.0.206)
- OS: ubuntu-16.04.6-server-amd64
- SRS package: srs-server-4.0-b1.tar.gz
- SRS edge node log (Log):
[2022-02-07 16:34:29.058][Trace][109157][jp1i16y3] XCORE-SRS/4.0.206(Leo)
[2022-02-07 16:34:29.058][Trace][109157][jp1i16y3] config parse complete
[2022-02-07 16:34:29.058][Trace][109157][jp1i16y3] write log to console
[2022-02-07 16:34:29.058][Trace][109157][jp1i16y3][MAIN] SRS/4.0.206(Leo), MIT
[2022-02-07 16:34:29.058][Trace][109157][jp1i16y3] authors: https://github.com/ossrs/srs/blob/4.0release/trunk/AUTHORS.txt
[2022-02-07 16:34:29.058][Trace][109157][jp1i16y3] contributors: Winlin<[email protected]> Winlin<[email protected]> Winlin<[email protected]> Wenjie<[email protected]> Wenjie<[email protected]> Wenjie<[email protected]> xiangcheng.liu<[email protected]> naijia.liu<[email protected]> alcoholyi<[email protected]> byteman<[email protected]> chad.wang<[email protected]> suhetao<[email protected]> Johnny<[email protected]> karthikeyan<[email protected]> StevenLiu<[email protected]> zhengfl<[email protected]> tufang14<[email protected]> allspace<[email protected]> niesongsong<[email protected]> rudeb0t<[email protected]> CallMeNP<[email protected]> synote<[email protected]> lovecat<[email protected]> panda1986<[email protected]> YueHonghui<[email protected]> ThomasDreibholz<[email protected]> JuntaoLiu<[email protected]> RocFang<[email protected]> MakarovYaroslav<[email protected]> MirkoVelic<[email protected]> HuiZhang(huzhang2)<[email protected]> OtterWa<[email protected]> walkermi<[email protected]> haofz<[email protected]> ME_Kun_Han<[email protected]> ljx0305<[email protected]> cenxinwei<[email protected]> StarBrilliant<[email protected]> xubin<[email protected]> intliang<[email protected]> flowerwrong<[email protected]> YLX<[email protected]> J<[email protected]> Harlan<[email protected]> hankun<[email protected]> JonathanBarratt<[email protected]> KeeganH<[email protected]> StevenLiu<[email protected]> liuxc0116<[email protected]> ChengdongZhang<[email protected]> lovacat<[email protected]> qiang.li<[email protected]> HungMingWu<[email protected]> Himer<[email protected]> XiaLixin<[email protected]> XiaLixin<[email protected]> XiaLixin<[email protected]> XiaLixin<[email protected]> alphonsetai<[email protected]> Michael.Ma<[email protected]> lam2003<[email protected]> ShiWei<[email protected]> ShiWei<[email protected]> XiaofengWang<[email protected]> XiaoZhihong<[email protected]> XiaoZhihong<[email protected]> XiaoZhihong<[email protected]> yanghuiwen<[email protected]> WuPengqiang<[email protected]> WuPengqiang<[email protected]> l<22312935+lam2003@github> xfalcon<x-falcon@github> ChenGuanghua<[email protected]> ChenGuanghua<[email protected]> LiPeng<[email protected]> LiPeng<[email protected]> yajun18<[email protected]> liulichuan<[email protected]> yapingcat<[email protected]> chenchengbin<[email protected]> ChenHaibo<[email protected]> ChenHaibo<[email protected]> jasongwq<[email protected]> yinjiaoyuan<[email protected]> PieerePi<[email protected]> JesseXi<[email protected]> PieerePi<40780488+PieerePi@github> ghostsf<[email protected]> xbpeng121<53243357+xbpeng121@github> johzzy<[email protected]> stone<[email protected]> cfw11<34058899+cfw11@github> Hung-YiChen<[email protected]> long<[email protected]> matthew1838<77285055+matthew1838@github> rise<[email protected]>
[2022-02-07 16:34:29.058][Trace][109157][jp1i16y3] cwd=/home/wsm/srs-server-4.0-b1/trunk, work_dir=./, build: 2021-12-20 20:18:37, configure: --x86-x64, uname: Linux ubuntu 4.4.0-142-generic #168-Ubuntu SMP Wed Jan 16 21:00:45 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux, osx: 0
[2022-02-07 16:34:29.058][Trace][109157][jp1i16y3] configure detail: --prefix=/usr/local/srs --hls=on --hds=off --dvr=on --ssl=on --https=on --ssl-1-0=off --ssl-local=off --sys-ssl=off --transcode=on --ingest=on --stat=on --http-callback=on --http-server=on --stream-caster=on --http-api=on --utest=off --cherrypy=off --srt=off --rtc=on --simulator=off --cxx11=off --cxx14=off --ffmpeg-fit=on --nasm=on --srtp-nasm=on --clean=on --gperf=off --gmc=off --gmd=off --gmp=off --gcp=off --gprof=off --static=off --shared-st=off --shared-srt=off --shared-ffmpeg=off --log-verbose=off --log-info=off --log-trace=on --gcov=off --debug=off --debug-stats=off --cross-build=off --cc=gcc --cxx=g++ --ar=ar --ld=ld --randlib=randlib
[2022-02-07 16:34:29.058][Trace][109157][jp1i16y3] srs checking config...
[2022-02-07 16:34:29.059][Trace][109157][jp1i16y3] ips, iface[0] bond0 ipv4 0x11443 192.168.1.185, iface[1] bond0 ipv6 0x11443 fe80::1618:77ff:fe4d:57f%bond0
[2022-02-07 16:34:29.059][Trace][109157][jp1i16y3] devices, intranet bond0 192.168.1.185, intranet bond0 fe80::1618:77ff:fe4d:57f%bond0
[2022-02-07 16:34:29.059][Warn][109157][jp1i16y3][22] stats network use index=0, ip=192.168.1.185, ifname=bond0
[2022-02-07 16:34:29.059][Warn][109157][jp1i16y3][22] stats disk not configed, disk iops disabled.
[2022-02-07 16:34:29.059][Trace][109157][jp1i16y3] write log to console
[2022-02-07 16:34:29.059][Trace][109157][jp1i16y3] features, rch:on, dash:on, hls:on, hds:off, srt:off, hc:on, ha:on, hs:on, hp:on, dvr:on, trans:on, inge:on, stat:on, sc:on
[2022-02-07 16:34:29.059][Trace][109157][jp1i16y3] SRS on amd64 x86_64, conf:./conf/origin.cluster.edge.conf, limit:1000, writev:1024, encoding:little-endian, HZ:100
[2022-02-07 16:34:29.059][Trace][109157][jp1i16y3] mw sleep:350ms. mr enabled:on, default:0, sleep:350ms
[2022-02-07 16:34:29.059][Trace][109157][jp1i16y3] gc:on, pq:30000ms, cscc:[0,16), csa:on, tn:on(may hurts performance), ss:auto(guess by merged write)
[2022-02-07 16:34:29.059][Trace][109157][jp1i16y3] system default latency(ms): mw(0-350) + mr(0-350) + play-queue(0-30000)
[2022-02-07 16:34:29.059][Warn][109157][jp1i16y3][22] SRS/4.0.206 is not stable
[2022-02-07 16:34:29.066][Trace][109157][jp1i16y3] fingerprint=C7:56:04:AE:A3:19:FD:1B:D3:2B:AD:B3:2E:A9:BF:6C:6E:6E:7E:12:C9:5F:D9:06:CF:5F:A1:7D:04:FB:A7:80
[2022-02-07 16:34:29.066][Trace][109157][jp1i16y3] CircuitBreaker: enabled=1, high=2x90, critical=1x95, dying=5x99
[2022-02-07 16:34:29.066][Trace][109157][jp1i16y3] http: root mount to ./objs/nginx/html
[2022-02-07 16:34:29.066][Trace][109157][jp1i16y3] server main cid=jp1i16y3, pid=109157, ppid=53212, asprocess=0
[2022-02-07 16:34:29.067][Trace][109157][jp1i16y3] write pid=109157 to objs/edge.pid success!
[2022-02-07 16:34:29.067][Trace][109157][jp1i16y3] RTMP listen at tcp://0.0.0.0:1935, fd=6
[2022-02-07 16:34:29.067][Trace][109157][jp1i16y3] signal installed, reload=1, reopen=10, fast_quit=15, grace_quit=3
[2022-02-07 16:34:29.067][Trace][109157][jp1i16y3] http: api mount /console to ./objs/nginx/html/console
[2022-02-07 16:34:29.068][Trace][109157][4s1606j8] Hybrid cpu=0.00%,11MB
[2022-02-07 16:34:29.068][Warn][109157][z993989r][22] use private address as ip: 192.168.1.185, ifname=bond0
[2022-02-07 16:34:29.068][Trace][109157][z993989r] Startup query id=c3ed94bc87f011ec84a7cfcef7e51d2e, eip=192.168.1.185, wait=312s
[2022-02-07 16:34:29.068][Trace][109157][mp510936] TCP: connection manager run, conns=0
[2022-02-07 16:34:29.069][Trace][109157][0tz9q447] RTC: connection manager run, conns=0
[2022-02-07 16:34:34.055][Trace][109157][4s1606j8] Hybrid cpu=0.00%,11MB
[2022-02-07 16:34:39.055][Trace][109157][4s1606j8] Hybrid cpu=1.00%,11MB
[2022-02-07 16:34:44.055][Trace][109157][4s1606j8] Hybrid cpu=1.00%,11MB, cid=1,1, timer=63,0,0, clock=0,49,1,0,0,0,0,0,0
[2022-02-07 16:34:49.055][Trace][109157][4s1606j8] Hybrid cpu=0.00%,11MB, cid=1,1, timer=63,0,0, clock=0,49,1,0,0,0,0,0,0
[2022-02-07 16:34:50.965][Trace][109157][55u9592w] RTMP client ip=192.168.1.210:62588, fd=7
[2022-02-07 16:34:50.970][Trace][109157][55u9592w] complex handshake success
[2022-02-07 16:34:50.971][Trace][109157][55u9592w] connect app, tcUrl=rtmp://192.168.1.185:1935/live, pageUrl=, swfUrl=, schema=rtmp, vhost=192.168.1.185, port=1935, app=live, args=null
[2022-02-07 16:34:50.971][Trace][109157][55u9592w] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2022-02-07 16:34:51.052][Trace][109157][55u9592w] ignore AMF0/AMF3 command message.
[2022-02-07 16:34:51.093][Trace][109157][55u9592w] ignore AMF0/AMF3 command message.
[2022-02-07 16:34:51.093][Trace][109157][55u9592w] client identified, type=rtmp-play, vhost=192.168.1.185, app=live, stream=test01, param=, duration=-1ms
[2022-02-07 16:34:51.093][Trace][109157][55u9592w] connected stream, tcUrl=rtmp://192.168.1.185:1935/live, pageUrl=, swfUrl=, schema=rtmp, vhost=__defaultVhost__, port=1935, app=live, stream=test01, param=, args=null
[2022-02-07 16:34:51.093][Trace][109157][55u9592w] new source, stream_url=/live/test01
[2022-02-07 16:34:51.093][Trace][109157][55u9592w] source url=/live/test01, ip=192.168.1.210, cache=1, is_edge=1, source_id=/
[2022-02-07 16:34:51.093][Trace][109157][55u9592w] ignore disabled exec for vhost=__defaultVhost__
[2022-02-07 16:34:51.094][Trace][109157][55u9592w] dispatch cached gop success. count=0, duration=-1
[2022-02-07 16:34:51.094][Trace][109157][55u9592w] create consumer, active=1, queue_size=0.00, jitter=30000000
[2022-02-07 16:34:51.094][Trace][109157][55u9592w] set fd=7, SO_SNDBUF=87040=>175000, buffer=350ms
[2022-02-07 16:34:51.094][Trace][109157][55u9592w] start play smi=0ms, mw_sleep=350, mw_msgs=8, realtime=0, tcp_nodelay=0
[2022-02-07 16:34:51.094][Trace][109157][55u9592w] update source_id=ueeyh653/55u9592w
[2022-02-07 16:34:51.096][Trace][109157][ueeyh653] complex handshake success.
[2022-02-07 16:34:51.097][Trace][109157][ueeyh653] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2022-02-07 16:34:51.176][Trace][109157][ueeyh653] connected, version=4.0.206.0, ip=127.0.0.1, pid=109153, id=0, dsu=1
[2022-02-07 16:34:51.176][Trace][109157][ueeyh653] edge-pull publish url rtmp://127.0.0.1:19351/live/test01, stream=test01 as test01
[2022-02-07 16:34:51.176][Trace][109157][ueeyh653] edge change from 100 to state 101 (pull).
[2022-02-07 16:34:51.216][Warn][109157][ueeyh653][11] RTMP redirect /live/test01 from 127.0.0.1:19351 to rtmp://127.0.0.1:19352/live/test01
[2022-02-07 16:34:51.219][Trace][109157][ueeyh653] complex handshake success.
[2022-02-07 16:34:51.219][Trace][109157][ueeyh653] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2022-02-07 16:34:51.296][Trace][109157][ueeyh653] connected, version=4.0.206.0, ip=127.0.0.1, pid=109154, id=0, dsu=1
[2022-02-07 16:34:51.296][Trace][109157][ueeyh653] edge-pull publish url rtmp://127.0.0.1:19352/live/test01, stream=test01 as test01
[2022-02-07 16:34:51.297][Trace][109157][ueeyh653] got metadata, width=1920, height=1080, vcodec=7, acodec=7
[2022-02-07 16:34:51.297][Trace][109157][ueeyh653] 42B video sh, codec(7, profile=High, level=4, 1920x1080, 0kbps, 0.0fps, 0.0s)
[2022-02-07 16:34:53.129][Warn][109157][ueeyh653][4] origin disconnected, retry, error code=1007 : recv message : recv interlaced message : read basic header : basic header requires 1 bytes : read bytes : read
thread [109157][ueeyh653]: ingest() [src/app/srs_app_edge.cpp:320][errno=4]
thread [109157][ueeyh653]: recv_message() [src/protocol/srs_rtmp_stack.cpp:372][errno=4]
thread [109157][ueeyh653]: recv_interlaced_message() [src/protocol/srs_rtmp_stack.cpp:859][errno=4]
thread [109157][ueeyh653]: read_basic_header() [src/protocol/srs_rtmp_stack.cpp:954][errno=4]
thread [109157][ueeyh653]: grow() [src/protocol/srs_protocol_stream.cpp:162][errno=4]
thread [109157][ueeyh653]: read() [src/protocol/srs_service_st.cpp:514][errno=4]
[2022-02-07 16:34:54.056][Trace][109157][4s1606j8] Hybrid cpu=0.00%,14MB, cid=1,1, timer=63,0,0, clock=0,49,1,0,0,0,0,0,0
[2022-02-07 16:34:54.392][Trace][109157][t4269891] RTMP client ip=192.168.1.210:62591, fd=9
[2022-02-07 16:34:54.406][Trace][109157][t4269891] complex handshake success
[2022-02-07 16:34:54.406][Trace][109157][t4269891] connect app, tcUrl=rtmp://192.168.1.185:1935/live, pageUrl=, swfUrl=, schema=rtmp, vhost=192.168.1.185, port=1935, app=live, args=null
[2022-02-07 16:34:54.406][Trace][109157][t4269891] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2022-02-07 16:34:54.484][Trace][109157][t4269891] ignore AMF0/AMF3 command message.
[2022-02-07 16:34:54.524][Trace][109157][t4269891] ignore AMF0/AMF3 command message.
[2022-02-07 16:34:54.524][Trace][109157][t4269891] client identified, type=rtmp-play, vhost=192.168.1.185, app=live, stream=test01, param=, duration=-1ms
[2022-02-07 16:34:54.524][Trace][109157][t4269891] connected stream, tcUrl=rtmp://192.168.1.185:1935/live, pageUrl=, swfUrl=, schema=rtmp, vhost=__defaultVhost__, port=1935, app=live, stream=test01, param=, args=null
[2022-02-07 16:34:54.524][Trace][109157][t4269891] source url=/live/test01, ip=192.168.1.210, cache=1, is_edge=1, source_id=ueeyh653/55u9592w
[2022-02-07 16:34:54.525][Trace][109157][t4269891] TCP: before dispose resource(RtmpConn)(0x13f4090), conns=2, zombies=0, ign=0, inz=0, ind=0
[2022-02-07 16:34:54.525][Error][109157][t4269891][11] serve error code=2020 : service cycle : rtmp: stream service : rtmp: create consumer : play edge : state is stopping
thread [109157][t4269891]: do_cycle() [src/app/srs_app_rtmp_conn.cpp:217][errno=11]
thread [109157][t4269891]: service_cycle() [src/app/srs_app_rtmp_conn.cpp:414][errno=11]
thread [109157][t4269891]: playing() [src/app/srs_app_rtmp_conn.cpp:659][errno=11]
thread [109157][t4269891]: create_consumer() [src/app/srs_app_source.cpp:2582][errno=11]
thread [109157][t4269891]: on_client_play() [src/app/srs_app_edge.cpp:674][errno=11](Resource temporarily unavailable)
[2022-02-07 16:34:54.525][Trace][109157][mp510936] TCP: clear zombies=1 resources, conns=2, removing=0, unsubs=0
[2022-02-07 16:34:54.525][Trace][109157][t4269891] TCP: disposing #0 resource(RtmpConn)(0x13f4090), conns=2, disposing=1, zombies=0
[2022-02-07 16:34:55.340][Trace][109157][h94o04l7] RTMP client ip=192.168.1.210:62592, fd=9
[2022-02-07 16:34:55.355][Trace][109157][h94o04l7] complex handshake success
[2022-02-07 16:34:55.356][Trace][109157][h94o04l7] connect app, tcUrl=rtmp://192.168.1.185:1935/live, pageUrl=, swfUrl=, schema=rtmp, vhost=192.168.1.185, port=1935, app=live, args=null
[2022-02-07 16:34:55.356][Trace][109157][h94o04l7] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2022-02-07 16:34:55.436][Trace][109157][h94o04l7] ignore AMF0/AMF3 command message.
[2022-02-07 16:34:55.476][Trace][109157][h94o04l7] ignore AMF0/AMF3 command message.
[2022-02-07 16:34:55.476][Trace][109157][h94o04l7] client identified, type=rtmp-play, vhost=192.168.1.185, app=live, stream=test01, param=, duration=-1ms
[2022-02-07 16:34:55.476][Trace][109157][h94o04l7] connected stream, tcUrl=rtmp://192.168.1.185:1935/live, pageUrl=, swfUrl=, schema=rtmp, vhost=__defaultVhost__, port=1935, app=live, stream=test01, param=, args=null
[2022-02-07 16:34:55.476][Trace][109157][h94o04l7] source url=/live/test01, ip=192.168.1.210, cache=1, is_edge=1, source_id=ueeyh653/55u9592w
[2022-02-07 16:34:55.477][Trace][109157][h94o04l7] TCP: before dispose resource(RtmpConn)(0x13f3a00), conns=2, zombies=0, ign=0, inz=0, ind=0
[2022-02-07 16:34:55.477][Error][109157][h94o04l7][11] serve error code=2020 : service cycle : rtmp: stream service : rtmp: create consumer : play edge : state is stopping
thread [109157][h94o04l7]: do_cycle() [src/app/srs_app_rtmp_conn.cpp:217][errno=11]
thread [109157][h94o04l7]: service_cycle() [src/app/srs_app_rtmp_conn.cpp:414][errno=11]
thread [109157][h94o04l7]: playing() [src/app/srs_app_rtmp_conn.cpp:659][errno=11]
thread [109157][h94o04l7]: create_consumer() [src/app/srs_app_source.cpp:2582][errno=11]
thread [109157][h94o04l7]: on_client_play() [src/app/srs_app_edge.cpp:674][errno=11](Resource temporarily unavailable)
[2022-02-07 16:34:55.477][Trace][109157][mp510936] TCP: clear zombies=1 resources, conns=2, removing=0, unsubs=0
[2022-02-07 16:34:55.477][Trace][109157][h94o04l7] TCP: disposing #0 resource(RtmpConn)(0x13f3a00), conns=2, disposing=1, zombies=0
[2022-02-07 16:34:56.129][Trace][109157][55u9592w] cleanup when unpublish
[2022-02-07 16:34:56.129][Trace][109157][55u9592w] edge change from 101 to 300 then 0 (init).
[2022-02-07 16:34:56.129][Trace][109157][55u9592w] TCP: before dispose resource(RtmpConn)(0x13656d0), conns=1, zombies=0, ign=0, inz=0, ind=0
[2022-02-07 16:34:56.129][Warn][109157][55u9592w][4] client disconnect peer. ret=1007
[2022-02-07 16:34:56.129][Trace][109157][mp510936] TCP: clear zombies=1 resources, conns=1, removing=0, unsubs=0
[2022-02-07 16:34:56.129][Trace][109157][55u9592w] TCP: disposing #0 resource(RtmpConn)(0x13656d0), conns=1, disposing=1, zombies=0
[2022-02-07 16:34:56.180][Trace][109157][5rz6t681] RTMP client ip=192.168.1.210:62595, fd=7
[2022-02-07 16:34:56.184][Trace][109157][5rz6t681] complex handshake success
[2022-02-07 16:34:56.184][Trace][109157][5rz6t681] connect app, tcUrl=rtmp://192.168.1.185:1935/live, pageUrl=, swfUrl=, schema=rtmp, vhost=192.168.1.185, port=1935, app=live, args=null
[2022-02-07 16:34:56.184][Trace][109157][5rz6t681] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2022-02-07 16:34:56.266][Trace][109157][5rz6t681] ignore AMF0/AMF3 command message.
[2022-02-07 16:34:56.305][Trace][109157][5rz6t681] ignore AMF0/AMF3 command message.
[2022-02-07 16:34:56.305][Trace][109157][5rz6t681] client identified, type=rtmp-play, vhost=192.168.1.185, app=live, stream=test01, param=, duration=-1ms
[2022-02-07 16:34:56.305][Trace][109157][5rz6t681] connected stream, tcUrl=rtmp://192.168.1.185:1935/live, pageUrl=, swfUrl=, schema=rtmp, vhost=__defaultVhost__, port=1935, app=live, stream=test01, param=, args=null
[2022-02-07 16:34:56.305][Trace][109157][5rz6t681] source url=/live/test01, ip=192.168.1.210, cache=1, is_edge=1, source_id=/ueeyh653
[2022-02-07 16:34:56.305][Trace][109157][5rz6t681] ignore disabled exec for vhost=__defaultVhost__
[2022-02-07 16:34:56.305][Trace][109157][5rz6t681] dispatch cached gop success. count=0, duration=-1
[2022-02-07 16:34:56.305][Trace][109157][5rz6t681] create consumer, active=1, queue_size=0.00, jitter=30000000
[2022-02-07 16:34:56.305][Trace][109157][5rz6t681] set fd=7, SO_SNDBUF=87040=>175000, buffer=350ms
[2022-02-07 16:34:56.305][Trace][109157][5rz6t681] start play smi=0ms, mw_sleep=350, mw_msgs=8, realtime=0, tcp_nodelay=0
[2022-02-07 16:34:56.305][Trace][109157][5rz6t681] update source_id=rh46sb82/ueeyh653
[2022-02-07 16:34:56.308][Trace][109157][rh46sb82] complex handshake success.
[2022-02-07 16:34:56.308][Trace][109157][rh46sb82] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2022-02-07 16:34:56.384][Trace][109157][rh46sb82] connected, version=4.0.206.0, ip=127.0.0.1, pid=109153, id=0, dsu=1
[2022-02-07 16:34:56.384][Trace][109157][rh46sb82] edge-pull publish url rtmp://127.0.0.1:19351/live/test01, stream=test01 as test01
[2022-02-07 16:34:56.384][Trace][109157][rh46sb82] edge change from 100 to state 101 (pull).
[2022-02-07 16:34:56.424][Warn][109157][rh46sb82][11] RTMP redirect /live/test01 from 127.0.0.1:19351 to rtmp://127.0.0.1:19352/live/test01
[2022-02-07 16:34:56.426][Trace][109157][rh46sb82] complex handshake success.
[2022-02-07 16:34:56.427][Trace][109157][rh46sb82] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2022-02-07 16:34:56.504][Trace][109157][rh46sb82] connected, version=4.0.206.0, ip=127.0.0.1, pid=109154, id=0, dsu=1
[2022-02-07 16:34:56.504][Trace][109157][rh46sb82] edge-pull publish url rtmp://127.0.0.1:19352/live/test01, stream=test01 as test01
[2022-02-07 16:34:56.505][Trace][109157][rh46sb82] got metadata, width=1920, height=1080, vcodec=7, acodec=7
[2022-02-07 16:34:56.505][Trace][109157][rh46sb82] 42B video sh, codec(7, profile=High, level=4, 1920x1080, 0kbps, 0.0fps, 0.0s)
[2022-02-07 16:34:58.830][Warn][109157][rh46sb82][4] origin disconnected, retry, error code=1007 : recv message : recv interlaced message : read basic header : basic header requires 1 bytes : read bytes : read
thread [109157][rh46sb82]: ingest() [src/app/srs_app_edge.cpp:320][errno=4]
thread [109157][rh46sb82]: recv_message() [src/protocol/srs_rtmp_stack.cpp:372][errno=4]
thread [109157][rh46sb82]: recv_interlaced_message() [src/protocol/srs_rtmp_stack.cpp:859][errno=4]
thread [109157][rh46sb82]: read_basic_header() [src/protocol/srs_rtmp_stack.cpp:954][errno=4]
thread [109157][rh46sb82]: grow() [src/protocol/srs_protocol_stream.cpp:162][errno=4]
thread [109157][rh46sb82]: read() [src/protocol/srs_service_st.cpp:514][errno=4]
[2022-02-07 16:34:59.056][Trace][109157][4s1606j8] Hybrid cpu=1.00%,14MB, cid=11,1, timer=63,0,0, clock=0,49,1,0,0,0,0,0,0, free=1, objs=(pkt:0,raw:0,fua:0,msg:60,oth:0,buf:0)
[2022-02-07 16:35:01.830][Trace][109157][5rz6t681] cleanup when unpublish
[2022-02-07 16:35:01.830][Trace][109157][5rz6t681] edge change from 101 to 300 then 0 (init).
[2022-02-07 16:35:01.830][Trace][109157][5rz6t681] TCP: before dispose resource(RtmpConn)(0x13656d0), conns=1, zombies=0, ign=0, inz=0, ind=0
[2022-02-07 16:35:01.830][Warn][109157][5rz6t681][4] client disconnect peer. ret=1007
[2022-02-07 16:35:01.830][Trace][109157][mp510936] TCP: clear zombies=1 resources, conns=1, removing=0, unsubs=0
[2022-02-07 16:35:01.830][Trace][109157][5rz6t681] TCP: disposing #0 resource(RtmpConn)(0x13656d0), conns=1, disposing=1, zombies=0
[2022-02-07 16:35:04.056][Trace][109157][4s1606j8] Hybrid cpu=1.00%,14MB, cid=11,1, timer=63,0,0, clock=0,49,1,0,0,0,0,0,0, free=1, objs=(pkt:0,raw:0,fua:0,msg:60,oth:0,buf:0)
[2022-02-07 16:35:09.056][Trace][109157][4s1606j8] Hybrid cpu=0.00%,14MB, cid=11,1, timer=63,0,0, clock=0,49,1,0,0,0,0,0,0, free=1, objs=(pkt:0,raw:0,fua:0,msg:60,oth:0,buf:0)
^C[2022-02-07 16:35:09.424][Trace][109157][jp1i16y3] sig=2, user terminate program, fast quit
[2022-02-07 16:35:10.060][Trace][109157][jp1i16y3] cleanup for quit signal fast=1, grace=0
[2022-02-07 16:35:10.060][Warn][109157][jp1i16y3][11] main cycle terminated, system quit normally.
[2022-02-07 16:35:10.160][Trace][109157][jp1i16y3] srs disposed
[2022-02-07 16:35:10.160][Trace][109157][jp1i16y3] srs terminated
- SRS config (Config):
origin.cluster.serverA.conf
# the config for srs origin-origin cluster
# @see https://github.com/ossrs/srs/wiki/v3_EN_OriginCluster
# @see full.conf for detail config.
listen 19350;
max_connections 1000;
daemon off;
srs_log_tank console;
pid ./objs/origin.cluster.serverA.pid;
http_api {
    enabled on;
    listen 9090;
}
vhost __defaultVhost__ {
    cluster {
        mode local;
        origin_cluster on;
        coworkers 127.0.0.1:9091 127.0.0.1:9092;
    }
}
origin.cluster.serverB.conf
# the config for srs origin-origin cluster
# @see https://github.com/ossrs/srs/wiki/v3_EN_OriginCluster
# @see full.conf for detail config.
listen 19351;
max_connections 1000;
daemon off;
srs_log_tank console;
pid ./objs/origin.cluster.serverB.pid;
http_api {
    enabled on;
    listen 9091;
}
vhost __defaultVhost__ {
    cluster {
        mode local;
        origin_cluster on;
        coworkers 127.0.0.1:9090 127.0.0.1:9092;
    }
}
origin.cluster.serverC.conf
# the config for srs origin-origin cluster
# @see https://github.com/ossrs/srs/wiki/v3_EN_OriginCluster
# @see full.conf for detail config.
listen 19352;
max_connections 1000;
daemon off;
srs_log_tank console;
pid ./objs/origin.cluster.serverC.pid;
http_api {
    enabled on;
    listen 9092;
}
vhost __defaultVhost__ {
    cluster {
        mode local;
        origin_cluster on;
        coworkers 127.0.0.1:9090 127.0.0.1:9091;
    }
}
origin.cluster.edge.conf
# the config for srs origin-edge cluster
# @see https://github.com/ossrs/srs/wiki/v3_EN_OriginCluster
# @see full.conf for detail config.
listen 1935;
max_connections 1000;
pid objs/edge.pid;
daemon off;
srs_log_tank console;
vhost __defaultVhost__ {
    cluster {
        mode remote;
        origin 127.0.0.1:19351 127.0.0.1:19350;
    }
}
Replay
How to reproduce the bug?
1. Open a shell, cd into the unpacked SRS directory, and run: ./objs/srs -c ./conf/origin.cluster.serverA.conf
2. Open another shell in the same directory and run: ./objs/srs -c ./conf/origin.cluster.serverB.conf
3. Open another shell in the same directory and run: ./objs/srs -c ./conf/origin.cluster.serverC.conf
4. Open another shell in the same directory and run: ./objs/srs -c ./conf/origin.cluster.edge.conf
5. On the local PC, open CMD and push a stream to node serverC with FFmpeg: ffmpeg -i rtsp://admin:[email protected]:554/h264/ch1/main/av_stream -c copy -f flv rtmp://192.168.1.185:19352/live/test01
6. On the local PC, open CMD and pull and play the stream from the edge node with ffplay: ffplay -i rtmp://192.168.1.185:1935/live/test01
7. Once playback is working, stop it and, within 3 seconds, repeat the pull from step 6. The problem reproduces and SRS logs the error. (ffplay is used in step 6 because the pull can be repeated quickly; VLC is slower to operate.)
Expect
(Please describe your expectation)
In an origin cluster deployment, I expect pulling from the edge node to behave like pulling from a standalone SRS deployment: as long as the stream is being published, a pull should succeed no matter when it is made.
(Business Scenario Supplement:
- There is a video preview list page in the project, and a video preview detail page. The list page has 6 small video streaming windows playing. When you click on a small video, you will enter the detail page, which has more content to display, including video preview, content introduction, operations, etc.
- The navigation from the list page to the detail page is what triggers the problem above: when leaving the list page the video instance is destroyed, so the connection to SRS is closed, and when entering the detail page the video instance is recreated, so a new connection to SRS is established.
- The video streams in the project are all pulled from the edge nodes in SRS.)
This looks like a retry-strategy problem, a boundary condition worth optimizing. You could shorten the retry interval a bit.
However, no matter what, when the edge sources from the origin it has to tear down the back-to-origin stream at some point, to avoid a large number of streams being pulled from the origin. So there will always be a time window in which playback is refused, and no amount of optimization can remove it completely.
A better way to work around this is for the client to support multiple edges and retry a few times on failure. That approach solves the problem effectively.
Of course, the edge's retry interval is still worth optimizing.
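A minimal, self-contained sketch of that client-side strategy; the edge list, the backoff values, and the play_once callback are illustrative assumptions, not part of SRS:

#include <chrono>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Try each edge in turn and retry with a short backoff when a play attempt
// is rejected (for example with the "state is stopping" error seen above).
// play_once() is a stand-in for whatever player API is actually used
// (flv.js, ffplay, a native RTMP client, ...); it returns true once playback starts.
bool play_with_retry(const std::vector<std::string>& edge_urls,
                     const std::function<bool(const std::string&)>& play_once,
                     int max_attempts = 6) {
    if (edge_urls.empty()) return false;
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        // Rotate through the edges so one busy or stopping edge does not block playback.
        const std::string& url = edge_urls[attempt % edge_urls.size()];
        if (play_once(url)) return true;
        // The window observed in this issue is about 3 seconds, so a few
        // retries with a growing backoff are enough to ride it out.
        std::this_thread::sleep_for(std::chrono::milliseconds(500 * (attempt + 1)));
    }
    return false;
}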
First of all, thank you for the patient reply. Seeing this issue tagged with the 5.0 milestone instantly made it feel far away.
Today I followed your suggestion and tried flv.js. It only supports listening for events and reconnecting from the application side, so that is what I did. There were no problems in testing; I will keep watching once it is deployed to production.
I have run into this as well. It appears to be a 3-second delay added for safety. You can modify the code; ideally the 3-second delay would be configurable.
https://github.com/ossrs/srs/issues/2215#issuecomment-901682601
By the way, this issue is becoming a frequently asked question.
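For the "modify the code" route, the fixed interval appears to be the SRS_EDGE_INGESTER_CIMS constant quoted later in this thread (srs_usleep(SRS_EDGE_INGESTER_CIMS) in SrsEdgeIngester::cycle(), src/app/srs_app_edge.cpp). A hypothetical one-line change to narrow the window, not a verified diff:

// Original definition is roughly 3 seconds, matching the "// 3s" comment in the snippet below.
// #define SRS_EDGE_INGESTER_CIMS (3 * SRS_UTIME_SECONDS)
#define SRS_EDGE_INGESTER_CIMS (200 * SRS_UTIME_MILLISECONDS) // shorter retry wait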
Thank you. I have read #2215 and also #2707. Indeed, judging from the discussions, this issue comes up every few months. Modifying the code, recompiling and repackaging is fairly costly for us, since we work in other languages.
For now we can only work around it with the "listen and reconnect" approach mentioned above, and will switch when there is a better solution. https://github.com/ossrs/srs/issues/2901#issuecomment-1031491071
Um, client reconnection is definitely necessary. Don't rely solely on server improvements. Stability is a system engineering concern, and of course, the client needs to be considered.
In general, there is always a way to bypass common problems. Solutions are always more abundant than problems.
I added an event for notification and waiting. I am also considering adding a configuration option to enable or disable this mode. We have already updated our own server this way, and so far there have been no issues. @winlinvip
srs_error_t SrsPlayEdge::on_client_play()
{
    srs_error_t err = srs_success;

    // start ingest when init state.
    if (state == SrsEdgeStateInit) {
        state = SrsEdgeStatePlay;
        err = ingester->start();
    } else if (state == SrsEdgeStateIngestStopping) {
        srs_cond_wait(ingester_wait);
        srs_trace("on_client_play srs_cond_wait.");
        if (state == SrsEdgeStateInit) {
            state = SrsEdgeStatePlay;
            err = ingester->start();
        } else if (state == SrsEdgeStateIngestStopping) {
            return srs_error_new(ERROR_RTMP_EDGE_PLAY_STATE, "state is stopping 1.");
        } else {
            return srs_error_new(ERROR_RTMP_EDGE_PLAY_STATE, "state is stopping 2.");
        }
    }

    return err;
}

void SrsPlayEdge::on_all_client_stop()
{
    // when all client disconnected,
    // and edge is ingesting origin stream, abort it.
    if (state == SrsEdgeStatePlay || state == SrsEdgeStateIngestConnected) {
        SrsEdgeState pstate = state;
        state = SrsEdgeStateIngestStopping;

        srs_trace("on_all_client_stop begin. ");
        ingester->stop();

        state = SrsEdgeStateInit;
        srs_trace("edge change from %d to %d then %d (init).", pstate, SrsEdgeStateIngestStopping, state);

        srs_cond_signal(ingester_wait);
        srs_trace("on_all_client_stop srs_cond_signal");
        return;
    }
}
Sorry for keeping everyone waiting for so long, I have resolved it in 4.0.267.
The solution of waiting for notification is good (although there is an even simpler solution). 👍
Hahaha, everyone contributed quite a few solutions, but the best place for the fix is SRS itself, so nobody needs to keep maintaining their own workaround patches anymore. 😄
I checked and the retry window is around 3 seconds. The reason is that when the last player stops, the edge stops pulling the stream, and stopping that coroutine takes a considerable amount of time; the other operations finish in milliseconds.
[2022-10-10 08:01:45.207] Debug: Start to stop edge coroutine
[2022-10-10 08:01:48.207] Debug: Start to stop edge upstream
[2022-10-10 08:01:48.208] Debug: Start to unpublish source
[2022-10-10 08:01:48.209] Debug: Edge stopped
Looking at the code, it is waiting for a fixed 3 seconds, so the mechanism here is not reasonable.
void SrsPlayEdge::on_all_client_stop() {
    if (state == SrsEdgeStatePlay || state == SrsEdgeStateIngestConnected) {
        state = SrsEdgeStateIngestStopping;
        ingester->stop();
    }
}

void SrsEdgeIngester::stop() {
    trd->stop();
    upstream->close();
    source->on_unpublish();
}

srs_error_t SrsEdgeIngester::cycle() {
    while (true) {
        do_cycle(); // Interrupted by stop.
        srs_usleep(SRS_EDGE_INGESTER_CIMS); // 3s
    }
}
A more reasonable approach is to improve the thread-stopping mechanism. The problem is that stopping the thread interrupts do_cycle(), but the outer sleep is not interrupted (after the interruption it simply goes back to sleeping). We just need to fix that in a simple way.
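This is not the actual patch that shipped in 4.0.267, only a minimal sketch of the idea, reusing the names from the snippet above (trd, do_cycle(), SRS_EDGE_INGESTER_CIMS) and assuming the usual SRS coroutine API (trd->pull()):

srs_error_t SrsEdgeIngester::cycle() {
    srs_error_t err = srs_success;

    while (true) {
        // Quit as soon as stop() has interrupted the coroutine, instead of
        // only noticing the interrupt inside do_cycle().
        if ((err = trd->pull()) != srs_success) {
            return err;
        }

        // do_cycle() is interrupted by stop(); treat its error as a retry.
        srs_error_t perr = do_cycle();
        srs_freep(perr);

        // Sleep the retry interval in small slices so a stop() issued during
        // the wait takes effect within ~10ms rather than after the full 3s.
        for (srs_utime_t slept = 0; slept < SRS_EDGE_INGESTER_CIMS; slept += 10 * SRS_UTIME_MILLISECONDS) {
            if ((err = trd->pull()) != srs_success) {
                return err;
            }
            srs_usleep(10 * SRS_UTIME_MILLISECONDS);
        }
    }

    return err;
}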
After the improvement, the stop operation completes within about 2ms, a window that normal user operations will essentially never hit.
[2022-10-10 08:07:07.455] Debug: Start to stop edge coroutine
[2022-10-10 08:07:07.457] Debug: Edge stopped
Although condition variables or delayed release would also solve this problem, releasing and recreating the object (somewhat like a restart) is the simplest solution, and it keeps the object's state clear and simple, so the problem is minimized.
Note: Some people hoped we would implement delayed release, but in testing there is no perceptible difference compared with simply re-pulling the stream. Delayed release also needs a timer, which adds complexity, and it keeps the stream being pulled even when nobody is playing it, which itself causes problems in some scenarios.
@winlinvip This change is fairly simple, but if the edge nodes are attacked, it could bring down the origin servers. Doesn't that contradict the purpose of the origin/edge design?
Attack prevention is a separate solution; this retry interval was not designed to prevent attacks.