SRS core dumps when a high-bitrate video stream is pushed

Open WenzhiShao opened this issue 1 year ago • 2 comments

Note: Please read the FAQ before filing an issue, see 2716

Description

Please describe your issue here

  1. SRS Version: 5.0

  2. SRS Log:

[2022-07-15 12:27:38.295][Trace][8959][8m199219] client finished.
[2022-07-15 12:27:38.295][Trace][8959][98p183s4] TCP: clear zombies=1 resources, conns=1, removing=0, unsubs=0
[2022-07-15 12:27:38.295][Trace][8959][8m199219] TCP: disposing #0 resource(HttpStream)(0x556fc0f2f6b0), conns=1, disposing=1, zombies=0
[2022-07-15 12:27:42.637][Trace][8959][622au47p] Hybrid cpu=0.00%,11MB
[2022-07-15 12:27:44.798][Trace][8959][it129s36] RTMP client ip=xxx, fd=12
[2022-07-15 12:27:44.835][Trace][8959][it129s36] complex handshake success
[2022-07-15 12:27:44.861][Trace][8959][it129s36] connect app, tcUrl=rtmp://localhost/live, pageUrl=, swfUrl=, schema=rtmp, vhost=localhost, port=1935, app=live, args=(obj)
[2022-07-15 12:27:44.861][Trace][8959][it129s36] edge-srs ip=xxx, version=4.0.253, pid=2496, id=0
[2022-07-15 12:27:44.861][Trace][8959][it129s36] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2022-07-15 12:27:45.008][Trace][8959][it129s36] client identified, type=flash-publish, vhost=localhost, app=live, stream=123, param=?, duration=0ms
[2022-07-15 12:27:45.008][Trace][8959][it129s36] connected stream, tcUrl=rtmp://localhost/live, pageUrl=, swfUrl=, schema=rtmp, vhost=__defaultVhost__, port=1935, app=live, stream=123, param=?, args=(obj)
[2022-07-15 12:27:45.008][Trace][8959][it129s36] new source, stream_url=/live/123
[2022-07-15 12:27:45.008][Trace][8959][it129s36] source url=/live/123, ip=xxx, cache=1, is_edge=0, source_id=/
[2022-07-15 12:27:45.008][Trace][8959][it129s36] new source, stream_url=/live/123
[2022-07-15 12:27:45.008][Trace][8959][it129s36] RTC bridge from RTMP, rtmp2rtc=0, keep_bframe=0, merge_nalus=0
[2022-07-15 12:27:45.008][Trace][8959][it129s36] hls: win=60000ms, frag=10000ms, prefix=, path=./objs/nginx/html, m3u8=[app]/[stream].m3u8, ts=[app]/[stream]-[seq].ts, aof=2.00, floor=0, clean=1, waitk=1, dispose=0ms, dts_directly=1
[2022-07-15 12:27:45.008][Trace][8959][it129s36] ignore disabled exec for vhost=__defaultVhost__
[2022-07-15 12:27:45.008][Trace][8959][it129s36] http: mount flv stream for sid=/live/123, mount=/live/123.flv
[2022-07-15 12:27:45.008][Trace][8959][it129s36] start publish mr=0/350, p1stpt=20000, pnt=5000, tcp_nodelay=0
[2022-07-15 12:27:45.035][Trace][8959][it129s36] got metadata, width=1920, height=1080, vcodec=7
[2022-07-15 12:27:45.035][Trace][8959][it129s36] 44B video sh,  codec(7, profile=Baseline, level=4, 1920x1080, 0kbps, 0.0fps, 0.0s)
[2022-07-15 12:27:47.637][Trace][8959][622au47p] Hybrid cpu=3.00%,14MB, cid=3,2, timer=63,0,0, clock=0,49,1,0,0,0,0,0,0, free=1, objs=(pkt:0,raw:0,fua:0,msg:40,oth:0,buf:0)
[2022-07-15 12:27:52.637][Trace][8959][622au47p] Hybrid cpu=2.00%,21MB, cid=3,2, timer=63,0,0, clock=0,49,1,0,0,0,0,0,0, free=1, objs=(pkt:0,raw:0,fua:0,msg:40,oth:0,buf:0)
[2022-07-15 12:27:53.134][Trace][8959][vcc4u0t9] RTMP client ip=xxxx, fd=14
[2022-07-15 12:27:53.203][Trace][8959][vcc4u0t9] complex handshake success
[2022-07-15 12:27:53.229][Trace][8959][vcc4u0t9] connect app, tcUrl=rtmp://xxxxx/live, pageUrl=, swfUrl=, schema=rtmp, vhost=101.200.224.229, port=1935, app=live, args=null
[2022-07-15 12:27:53.229][Trace][8959][vcc4u0t9] protocol in.buffer=0, in.ack=0, out.ack=0, in.chunk=128, out.chunk=128
[2022-07-15 12:27:53.394][Trace][8959][vcc4u0t9] ignore AMF0/AMF3 command message.
[2022-07-15 12:27:53.487][Trace][8959][vcc4u0t9] ignore AMF0/AMF3 command message.
[2022-07-15 12:27:53.487][Trace][8959][vcc4u0t9] client identified, type=rtmp-play, vhost=xxx, app=live, stream=livestream, param=, duration=-1ms
[2022-07-15 12:27:53.487][Trace][8959][vcc4u0t9] connected stream, tcUrl=rtmp://xxxxx/live, pageUrl=, swfUrl=, schema=rtmp, vhost=__defaultVhost__, port=1935, app=live, stream=livestream, param=, args=null
[2022-07-15 12:27:53.487][Trace][8959][vcc4u0t9] new source, stream_url=/live/livestream
[2022-07-15 12:27:53.487][Trace][8959][vcc4u0t9] source url=/live/livestream, ip=xxxx cache=1, is_edge=0, source_id=/
  3. SRS Config:
# main config for srs.
# @see full.conf for detail config.

listen              1935;
max_connections     1000;
#srs_log_tank        file;
srs_log_file        ./objs/server.log;
daemon              on;
pid                 objs/server.pid;
http_api {
    enabled         on;
    listen          1985;
}
http_server {
    enabled         on;
    listen          8080;
    dir             ./objs/nginx/html;
}
rtc_server {
    enabled on;
    listen 8000; # UDP port
    # @see https://github.com/ossrs/srs/wiki/v4_CN_WebRTC#config-candidate
    candidate $CANDIDATE;
}
vhost __defaultVhost__ {
    hls {
        enabled         on;
    }
    http_remux {
        enabled     on;
        mount       [vhost]/[app]/[stream].flv;
    }
    rtc {
        enabled     on;
        # @see https://github.com/ossrs/srs/wiki/v4_CN_WebRTC#rtmp-to-rtc
        rtmp_to_rtc off;
        # @see https://github.com/ossrs/srs/wiki/v4_CN_WebRTC#rtc-to-rtmp
        rtc_to_rtmp off;
    }
    cluster {
        mode            local;
        origin_cluster  on;
    }
}

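Replay

Please describe how to replay the bug.

  1. Set up a simple edge SRS -> origin SRS -> play SRS chain; publish to the edge and pull from the play SRS.
  2. Publish the stream using Python together with ffmpeg; the video has a fairly high resolution.
  3. Pull the stream from the play SRS.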

Expect

Please describe your expectation.

Playback stutters at first, and then the origin SRS crashes. The performance monitor in MobaXterm shows memory being exhausted in an instant. The core dump file shows:

Core was generated by `./objs/srs -c ./myconfig/server.conf'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000556fbfec8272 in std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::size (this=0x30)
    at /usr/include/c++/7/bits/stl_vector.h:671
671           { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }

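The crash is located in the library's vector code, so my guess is that memory allocation was exhausted.

Additional information: the Python code used to publish the stream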

import cv2
import subprocess

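# RTMP server address
rtmp = r'rtmp://localhost/live/123'   # the trailing 123 is an arbitrary stream name; change it as you like, e.g. 123321 or 456
# Read the video and get its properties
# The source (e.g. camera 0) can also be replaced with an RTSP address to publish an RTSP stream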
cap = cv2.VideoCapture('./video/303.mp4')
size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
sizeStr = str(size[0]) + 'x' + str(size[1])
command = ['ffmpeg',
           '-y', '-an',
           '-f', 'rawvideo',
           '-vcodec', 'rawvideo',
           '-pix_fmt', 'bgr24',
           '-s', sizeStr,
           '-r', '25',
           '-i', '-',
           '-c:v', 'libx264',
           '-pix_fmt', 'yuv420p',
           '-preset', 'ultrafast',
           '-f', 'flv',
           rtmp]
pipe = subprocess.Popen(command, shell=False, stdin=subprocess.PIPE
                        )
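while cap.isOpened():
    success, frame = cap.read()
    if not success:
        # End of the video or a read error: stop instead of looping forever
        break
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    # tobytes() replaces the deprecated ndarray.tostring()
    pipe.stdin.write(frame.tobytes())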
cap.release()
pipe.terminate()

WenzhiShao avatar Jul 15 '22 04:07 WenzhiShao

What does the stack of the core dump look like? Can you run bt to check?

How high is the video bitrate?

winlinvip avatar Aug 07 '22 04:08 winlinvip

Hello, the content of the stack frame is as follows. The term "high bitrate" was a mistake on my part. After checking, I found that it is a regular video with a resolution of 1920x1080, 25fps, and 4081kbps. It is just higher compared to the default source.flv file. When I experimented, I noticed that this issue does not occur when pushing the source.flv file.

#1  0x000055ec3249e585 in SrsConfig::get_vhost_coworkers (this=0x55ec33e6d300,
    vhost="__defaultVhost__") at src/app/srs_app_config.cpp:4955
#2  0x000055ec3243f381 in SrsRtmpConn::playing (this=0x55ec33ffd670, source=
    0x55ec34019660) at src/app/srs_app_rtmp_conn.cpp:615
#3  0x000055ec3243e6f0 in SrsRtmpConn::stream_service_cycle (this=0x55ec33ffd670)
    at src/app/srs_app_rtmp_conn.cpp:532
#4  0x000055ec3243d5d8 in SrsRtmpConn::service_cycle (this=0x55ec33ffd670)
    at src/app/srs_app_rtmp_conn.cpp:403
#5  0x000055ec3243c0d8 in SrsRtmpConn::do_cycle (this=0x55ec33ffd670)
    at src/app/srs_app_rtmp_conn.cpp:216
#6  0x000055ec32445180 in SrsRtmpConn::cycle (this=0x55ec33ffd670)
    at src/app/srs_app_rtmp_conn.cpp:1457
#7  0x000055ec3247481e in SrsFastCoroutine::cycle (this=0x55ec33ffd800)
    at src/app/srs_app_st.cpp:272
#8  0x000055ec324748ba in SrsFastCoroutine::pfn (arg=0x55ec33ffd800)
    at src/app/srs_app_st.cpp:287
#9  0x000055ec3258a969 in _st_thread_main () at sched.c:363
#10 0x000055ec3258b205 in st_thread_create (
    start=0x55ec3247489a <SrsFastCoroutine::pfn(void*)>, arg=0x55ec33ffd800, joinable=1,
    stk_size=65536) at sched.c:694

WenzhiShao avatar Aug 07 '22 06:08 WenzhiShao

You enabled the origin cluster (origin_cluster on;) but did not complete the rest of the origin cluster configuration, which is what causes the problem.

Even so, SRS should not core dump here.

winlinvip avatar Aug 19 '22 12:08 winlinvip

Confirmed: this is a configuration issue. Enabling the origin cluster (origin_cluster on;) without configuring cluster.coworkers crashes SRS, and the crash reproduces every time.

vector<string> SrsConfig::get_vhost_coworkers(string vhost)
{
    vector<string> coworkers;

    SrsConfDirective* conf = get_vhost(vhost);
    if (!conf) {
        return coworkers;
    }

    conf = conf->get("cluster");
    if (!conf) {
        return coworkers;
    }

    conf = conf->get("coworkers");
    for (int i = 0; i < (int)conf->args.size(); i++) {

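The crash happens because conf is not checked after conf->get("coworkers"): when coworkers is not configured, conf is NULL and conf->args.size() dereferences it.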

This is essentially a configuration problem rather than a common one, so it will only be fixed in version 5.0.
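
For illustration, the kind of guard that get_vhost_coworkers lacks could look roughly like the sketch below. It is not the actual SRS patch: MockDirective is a hypothetical stand-in for SrsConfDirective that models only get() and args, and get_coworkers_checked is an invented name. The only assumption carried over from the thread is that get() returns a null pointer when the named directive is absent.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for SrsConfDirective (not the real class):
// only the name, the args and a child lookup are modelled.
struct MockDirective {
    std::string name;
    std::vector<std::string> args;
    std::vector<MockDirective> children;

    // Returns the named child directive, or nullptr when it is absent.
    MockDirective* get(const std::string& child_name) {
        for (auto& child : children) {
            if (child.name == child_name) {
                return &child;
            }
        }
        return nullptr;
    }
};

// Guarded variant of the coworkers lookup: every get() result is checked
// before it is dereferenced, so a vhost with "origin_cluster on;" but no
// "coworkers" yields an empty list instead of a segmentation fault.
std::vector<std::string> get_coworkers_checked(MockDirective* vhost) {
    std::vector<std::string> coworkers;
    if (!vhost) {
        return coworkers;
    }

    MockDirective* conf = vhost->get("cluster");
    if (!conf) {
        return coworkers;
    }

    conf = conf->get("coworkers");
    if (!conf) {
        return coworkers;  // the check that the quoted snippet does not have
    }

    for (const auto& arg : conf->args) {
        coworkers.push_back(arg);
    }
    return coworkers;
}
```

With a vhost whose cluster block contains only mode and origin_cluster, as in the config posted above, this variant returns an empty vector, and the caller can then decide how to handle an origin cluster that has no coworkers.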

winlinvip avatar Aug 22 '22 03:08 winlinvip