srs WebRTC: Crackling/popping noise when transcoding AAC audio to opus with FFmpeg's built-in opus encoder

Description

Whenever FFmpeg's built-in opus is used, the audio always crackles; this is reproducible.

Patch Commit ID: https://github.com/ossrs/srs/commit/8d61c2a064315dfa83d5b29e1068cef00d192561

SRS Version: develop v5.0.36

Building FFmpeg

To use FFmpeg's built-in opus: --enable-decoder=opus --enable-encoder=opus
To use the libopus library: --enable-libopus

SRS Config:

listen              1935;
max_connections     1000;
daemon              off;
srs_log_tank        console;

http_server {
    enabled         on;
    listen          8080;
    dir             ./objs/nginx/html;
}

http_api {
    enabled         on;
    listen          1985;
}

rtc_server {
    enabled on;
    listen 8000; # UDP port
    # @see https://ossrs.net/lts/zh-cn/docs/v4/doc/webrtc#config-candidate
    #candidate $CANDIDATE;
    candidate 10.254.44.205;
}

vhost __defaultVhost__ {
    rtc {
        enabled     on;
        # @see https://ossrs.net/lts/zh-cn/docs/v4/doc/webrtc#rtmp-to-rtc
        rtmp_to_rtc on;
        # @see https://ossrs.net/lts/zh-cn/docs/v4/doc/webrtc#rtc-to-rtmp
        rtc_to_rtmp on;
    }
    http_remux {
        enabled     on;
        mount       [vhost]/[app]/[stream].flv;
    }
}

Replay

Steps to reproduce the bug:

Publish over RTMP: ffmpeg -stream_loop -1 -re -i 264_aac_basline_48k.mp4 -c copy -f flv "rtmp://127.0.0.1/live/livestream"
Play over RTC: open the player at https://127.0.0.1/players/rtc_player.html and play: https://127.0.0.1/players/rtc_player.html

Expect

RTC playback works normally.

Aug 09 '22 02:08 chundonglinlin

After changing the opus library, you have to delete the FFmpeg build and recompile:

rm -rf objs/ffmpeg/*
./configure 
make

Aug 10 '22 01:08 winlinvip

To use FFmpeg's built-in opus codec, pass this configure option:

  --ffmpeg-opus=on|off      Whether enable the FFmpeg native opus codec. Default: off

After changing the opus library, you need to delete FFmpeg and recompile it.

rm -rf objs
./configure --ffmpeg-opus=on
make

Jan 06 '23 09:01 winlinvip

The main cause is that the frame_size set in FFmpeg's opusenc.c (note: not libopusenc.c) is 120. At the opus sampling rate of 48000 Hz, each frame is therefore only 2.5 ms long.
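The 2.5 ms figure follows directly from frame_size and the sampling rate. As a minimal sketch of that arithmetic (the helper name frame_duration_ms is hypothetical and not part of SRS or FFmpeg):

```cpp
#include <cassert>

// Hypothetical helper, for illustration only.
// Returns the duration in milliseconds of one audio frame that holds
// `frame_size` samples per channel at `sample_rate` Hz.
double frame_duration_ms(int frame_size, int sample_rate)
{
    assert(sample_rate > 0);
    return 1000.0 * frame_size / sample_rate;
}
```

With the values from this comment, frame_duration_ms(120, 48000) gives 2.5. For comparison, a 960-sample frame at 48000 Hz would be 20 ms; 960 is used here only as an example value and does not come from the SRS code.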

From my testing, it seems that approximately every 5 input frames are needed to obtain one output frame from the encoder. Moreover, the latency is high and the size is small, as shown in the following figure.

The PCM frames from pts 2936 to 2954 are the input to the encoder, while the frame coming out of the encoder has pts 2589. That is a delay of more than 300 ms (2936 - 2589 = 347), and the output size is very small.

Any experts who know how to fix this issue, please help and reply to this problem.

Mar 23 '23 13:03 xiaozhihong

Is it possible to fix it by upgrading FFmpeg to 5.1?

Mar 23 '23 13:03 winlinvip

Is it possible to fix it by upgrading FFmpeg to 5.1?

I tested the latest FFmpeg release, 5.1.3, and the problem is already solved there.

SRS will update FFmpeg from 4.x to 5.1.3.

Mar 29 '23 08:03 xiaozhihong

I followed the test method described above and switched to FFmpeg 5.1.3, but the loud crackling noise still persists.

I added debugging logs:

srs_error_t SrsAudioTranscoder::decode_and_resample(SrsAudioFrame *pkt)
{
    srs_error_t err = srs_success;

    dec_packet_->data = (uint8_t *)pkt->samples[0].bytes;
    dec_packet_->size = pkt->samples[0].size;

    srs_trace("decode_and_resample: dec_packet_->size=%d", dec_packet_->size);

    // Ignore empty packet, see https://github.com/ossrs/srs/pull/2757#discussion_r759797651
    if (!dec_packet_->data || !dec_packet_->size){
        return err;
    }

    char err_buf[AV_ERROR_MAX_STRING_SIZE] = {0};
    int error = avcodec_send_packet(dec_, dec_packet_);
    if (error < 0) {
        return srs_error_new(ERROR_RTC_RTP_MUXER, "submit to dec(%d,%s)", error,
            av_make_error_string(err_buf, AV_ERROR_MAX_STRING_SIZE, error));
    }

    new_pkt_pts_ = pkt->dts + pkt->cts;
    while (error >= 0) {
        error = avcodec_receive_frame(dec_, dec_frame_);
        if (error == AVERROR(EAGAIN) || error == AVERROR_EOF) {
            return err;
        } else if (error < 0) {
            return srs_error_new(ERROR_RTC_RTP_MUXER, "Error during decoding(%d,%s)", error,
                av_make_error_string(err_buf, AV_ERROR_MAX_STRING_SIZE, error));
        }

        // Decoder is OK now, try to init swr if not initialized.
        if (!swr_ && (err = init_swr(dec_)) != srs_success) {
            return srs_error_wrap(err, "resample init");
        }

        int in_samples = dec_frame_->nb_samples;
        const uint8_t **in_data = (const uint8_t**)dec_frame_->extended_data;
        int idx = 0;
        do {
            /* Convert the samples using the resampler. */
            int frame_size = swr_convert(swr_, swr_data_, enc_->frame_size, in_data, in_samples);
            if ((error = frame_size) < 0) {
                return srs_error_new(ERROR_RTC_RTP_MUXER, "Could not convert input samples(%d,%s)", error,
                    av_make_error_string(err_buf, AV_ERROR_MAX_STRING_SIZE, error));
            }
            srs_trace("idx=%d in_samples=%d, enc_->frame_size=%d, frame_size=%d", idx++, in_samples, enc_->frame_size, frame_size);

            in_data = NULL; in_samples = 0;
            if ((err = add_samples_to_fifo(swr_data_, frame_size)) != srs_success) {
                return srs_error_wrap(err, "write samples");
            }
        } while (swr_get_out_samples(swr_, in_samples) >= enc_->frame_size);
    }

    return err;
}

Findings:

With libopus, swr_convert is called once after each frame is decoded.
With ffmpeg-opus, swr_convert is called 8 or 9 times after each frame is decoded.
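These counts look consistent with the do/while loop in decode_and_resample, which keeps draining the resampler in chunks of enc_->frame_size. The sketch below is a simplified model of that loop, not SRS code. It assumes each decoded AAC frame carries 1024 samples, that there is no sample-rate change (the test file is 48 kHz), and that leftover samples are simply carried over to the next frame. The function name count_convert_calls is hypothetical, and the frame size of 960 used for the libopus comparison is an assumption (20 ms at 48 kHz).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model of the drain loop, for illustration only.
// Returns, for each decoded frame, how many times the converter would be
// called when every call outputs at most `enc_frame_size` samples.
std::vector<int> count_convert_calls(int decoded_samples, int enc_frame_size, int frames)
{
    assert(decoded_samples > 0 && enc_frame_size > 0);

    std::vector<int> calls;
    int buffered = 0; // samples still held inside the modelled resampler
    for (int i = 0; i < frames; ++i) {
        buffered += decoded_samples; // a new decoded frame arrives
        int n = 0;
        do {
            // One convert call takes out at most one encoder frame.
            buffered -= std::min(buffered, enc_frame_size);
            ++n;
        } while (buffered >= enc_frame_size);
        calls.push_back(n);
    }
    return calls;
}
```

Under these assumptions, count_convert_calls(1024, 120, 3) yields 8, 9, 8, which matches the 8 or 9 calls seen with ffmpeg-opus, while count_convert_calls(1024, 960, 2) yields 1, 1. In this model the leftover samples accumulate, so a larger frame size would still see an occasional second call.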

Audio encoder initialization:

Oct 18 '23 16:10 chundonglinlin
