
Poor quality of ffmpeg aac encoder

Open artenax opened this issue 1 year ago • 24 comments

Describe the current behavior

Hello. I understand that you cannot drop the native ffmpeg aac encoder in favor of fdk (-vbr 5), or even faac, for licensing reasons. But please at least switch from CBR to VBR mode; according to my tests this greatly improves quality. The bitrate will be about 200 kbps:

-c:a aac -b:a 256k

↓

-c:a aac -q:a 2.3 -cutoff 18000

One more thing: right now the PeerTube encoder leaves a user-provided aac stream untouched (no transcoding), but only when it is in an mp4 container. So the user can supply their own high-quality aac (encoded with fdk or the commercial qaac). But if the aac is in an mkv container, PeerTube always re-encodes the audio (because it cannot detect the aac bitrate in an mkv container?). It would be good to skip re-encoding the aac audio in such cases too.
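The container need not decide this: ffprobe reports the codec name for an aac stream in both mp4 and mkv, although mkv inputs commonly lack a per-stream bit_rate field. A minimal sketch of a copy-or-transcode check, assuming the input was already probed with `ffprobe -show_streams -of json` and parsed into a dict. The function name, the `max_bitrate` ceiling and the fallback rule are hypothetical, not PeerTube's actual logic:

```python
def audio_args(stream: dict, max_bitrate: int = 384_000) -> list:
    """Return ffmpeg audio arguments: stream copy when possible, else re-encode.

    `stream` is one entry of the "streams" array from
    `ffprobe -show_streams -of json` (assumed already parsed).
    `max_bitrate` is a hypothetical ceiling, not a PeerTube setting.
    """
    reencode = ["-c:a", "aac", "-b:a", "256k"]
    if stream.get("codec_name") != "aac":
        return reencode

    # ffprobe emits bit_rate as a string and omits it when unknown,
    # which is common for Matroska (mkv) inputs.
    bit_rate = stream.get("bit_rate")
    if bit_rate is None or int(bit_rate) <= max_bitrate:
        # An unknown bitrate is treated as acceptable here (assumption).
        return ["-c:a", "copy"]
    return reencode
```

With this rule an aac stream from mkv that has no bit_rate field would be copied rather than re-encoded.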

Something has to change. The ffmpeg aac 256k CBR mode used now is terrible. Just listen with headphones to material with a lot of high frequencies. Actually, it is not only VBR mode that improves quality (it triggers fewer encoder bugs), but also the -cutoff frequency filter. YouTube has always cut off AAC at 16000 Hz.
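For anyone who wants to compare the two modes locally, here is a small sketch that builds both command lines from the options quoted above. The function names and file names are hypothetical; only the ffmpeg options themselves come from this issue.

```python
import shlex
from typing import List, Optional


def native_aac_args(quality: Optional[float] = None,
                    cutoff_hz: Optional[int] = None) -> List[str]:
    """Audio options for ffmpeg's native aac encoder.

    With `quality` set, use VBR (-q:a); otherwise use the 256k CBR
    setting that this issue describes as the current behavior.
    """
    args = ["-c:a", "aac"]
    if quality is not None:
        args += ["-q:a", str(quality)]
    else:
        args += ["-b:a", "256k"]
    if cutoff_hz is not None:
        args += ["-cutoff", str(cutoff_hz)]
    return args


def ffmpeg_command(src: str, dst: str, audio: List[str]) -> str:
    """Return a shell-quoted command line; the video stream is copied."""
    return shlex.join(["ffmpeg", "-i", src, "-c:v", "copy", *audio, dst])


print(ffmpeg_command("in.mp4", "cbr.mp4", native_aac_args()))
print(ffmpeg_command("in.mp4", "vbr.mp4", native_aac_args(2.3, 18000)))
```

Encoding the same source both ways and listening to the two outputs side by side is the comparison described in this post.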

Steps to reproduce

Describe the expected behavior

Additional information

  • PeerTube instance:
    • URL: https://peervideo.ru/
    • Version: 5.0.1-nightly-2023-02-24

artenax avatar Feb 25 '23 16:02 artenax

Received!

ElonVampire avatar Feb 25 '23 16:02 ElonVampire

It is also useful to add -af volume=-3dB to prevent clipping and artifacts.
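For reference, a gain in decibels maps to a linear amplitude factor of 10^(dB/20), so -3 dB scales samples to roughly 71% of their original level. A quick check of that arithmetic (the helper name is just for illustration):

```python
def db_to_amplitude(db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)


for db in (-2, -3):
    print(f"{db} dB -> x{db_to_amplitude(db):.3f}")
```

This gives factors of about 0.794 for -2 dB and 0.708 for -3 dB, which is the headroom left for encoder overshoot.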

artenax avatar Feb 25 '23 17:02 artenax

Hello,

We should use fdk aac encoder if the ffmpeg binary supports it :thinking:

Regarding the variable bitrate for the default aac encoder, I see this on the https://trac.ffmpeg.org/wiki/Encode/AAC page:

"Effective range for -q:a is around 0.1-2. This VBR is experimental and likely to get even worse results than the CBR."

Chocobozzz avatar Feb 25 '23 19:02 Chocobozzz

Since ffmpeg 3.0+ (in 2016) the native aac encoder has been significantly revised and improved. The developers call the CBR mode stable. However, my testing shows that ffmpeg 3.0+ aac now has other bugs, which show up especially in CBR mode and much less in VBR. The -cutoff 18000 and -af volume=-2dB options are also needed to reduce these artifacts. I can provide you with audio samples...

artenax avatar Feb 25 '23 20:02 artenax

fdk-aac is much better than aac I switched to it years ago indeed.

ROBERT-MCDOWELL avatar Feb 25 '23 22:02 ROBERT-MCDOWELL

Thanks @artenax, yes please send your audio sample. Can you also confirm that PeerTube uses libfdk if your ffmpeg binary supports it?
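One way to check this on a server is to look for libfdk_aac in the output of `ffmpeg -hide_banner -encoders`. A rough sketch, assuming the usual listing layout (a six-character flags column followed by the encoder name). The helper names are hypothetical and this is not how PeerTube itself performs the check:

```python
import subprocess


def parse_encoders(listing: str) -> set:
    """Extract encoder names from `ffmpeg -encoders` output."""
    names = set()
    for line in listing.splitlines():
        parts = line.split()
        # Encoder rows look like: " A....D aac   AAC (Advanced Audio Coding)".
        # Legend rows (" A..... = Audio") have "=" in the name column.
        if len(parts) >= 2 and len(parts[0]) == 6 and parts[1] != "=":
            names.add(parts[1])
    return names


def has_libfdk(ffmpeg: str = "ffmpeg") -> bool:
    """Run the given ffmpeg binary and report whether libfdk_aac is listed."""
    out = subprocess.run([ffmpeg, "-hide_banner", "-encoders"],
                         capture_output=True, text=True, check=True).stdout
    return "libfdk_aac" in parse_encoders(out)
```

Calling `has_libfdk()` only tells you what the binary was built with, not which encoder PeerTube ends up selecting.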

Chocobozzz avatar Feb 26 '23 15:02 Chocobozzz

I just use opus. The latest version of peertube with git ffmpeg seems to support both firefox and chrome now. Previously opus would only work on one browser and not the other with peertube/hls.

vid-bin avatar Feb 26 '23 20:02 vid-bin

Effective range for -q:a is around 0.1-2

However, ffmpeg accepts values higher than 2.0 outside the documentation and the bitrate is higher.

Can you also confirm that PeerTube uses libfdk if your ffmpeg binary supports it?

I don't know about the server side of the PeerTube I use, but it seems that fdk is not used there. I'm just a user/uploader.

fdk-aac is much better than aac I switched to it years ago indeed

Me too. Although, fdk works in 16 bit and we need to be careful with the levels.

Here are examples: cbr256.m4a (ffmpeg -c:a aac -b:a 256k) and vbr2.3.m4a (ffmpeg -c:a aac -q:a 2.3, 210 kbps). I don't know whether you can hear the difference from the original, or between CBR and VBR, in your headphones. Pay attention to the high frequencies.

Try also adding the options -cutoff 18000 -af volume=-2dB (I didn't add them here).
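To check what average bitrate a given -q:a value actually produced, divide the file size in bits by the duration. A small sketch with made-up numbers (the size and duration below are illustrative, not measurements of the attached samples); container overhead is ignored:

```python
def average_kbps(size_bytes: int, duration_s: float) -> float:
    """Average bitrate in kbit/s (1 kbit = 1000 bits)."""
    return size_bytes * 8 / duration_s / 1000


# Hypothetical example: a 2,625,000-byte file lasting 100 seconds.
print(average_kbps(2_625_000, 100))
```

For that hypothetical file the result is 210 kbit/s.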

original.flac actually took a long path before it reached me and is not exactly lossless (high-bitrate aac at 317 kbps when exporting from video editing software > youtube opus 137k > flac). But this is a real case and a good stress test.

original.zip aac.zip

artenax avatar Feb 26 '23 21:02 artenax

It's really rare for frequencies to go up to 16 kHz in music; they are often under -10 to -20 dB and not really perceptible in digital audio. About clipping, the most effective setting I have found is -c:a libfdk_aac -b:a 256k -af 'loudnorm=I=-13:LRA=20:TP=-2'. With this setting, whether the content is voice, noise or music, the presence and dynamics are great and there is no clipping at all. It's really close to YT, even better sometimes.

ROBERT-MCDOWELL avatar Feb 26 '23 23:02 ROBERT-MCDOWELL

Guys, it's funny, but I checked some other encoders and they gave good results: ac3 (sonic and ffmpeg) 192k, dts 320k, wma9 std VBR V90 190k and wma9 pro VBR V90 160k (microsoft, from winxp), vorbis vbr (ffmpeg) 128k. And even lame mp3 --cbr -b 128 -q 0 -m j --lowpass 16 had artifacts, but less. ffmpeg aac 128k is even worse. faac is not very good.

artenax avatar Mar 01 '23 19:03 artenax

ffmpeg aac encoder is garbage. You need to have fdk version.

hsn10 avatar Mar 21 '23 18:03 hsn10

For now, this could be fixed with a plugin which changes which encoders are used (such as the vp9 or opus plugins).

FiskFan1999 avatar Jun 02 '23 18:06 FiskFan1999

Any temporary solutions so far? Audio quality is really bad.

@FiskFan1999 I believe we can use peertube-plugin-transcoding-profile-debug temporarily but I am not sure how to do it yet.

Kinuseka avatar Mar 12 '24 00:03 Kinuseka

Can you use -c:a aac -q:a 2.4 -cutoff 18000 -af volume=-2dB ? It's not that bad. It's inaccurate, but not terrible. Try experimenting with advanced settings: ffmpeg -h encoder=aac

AAC encoder AVOptions:
  -aac_coder         <int>        E...A...... Coding algorithm (from 0 to 2) (default twoloop)
     anmr            0            E...A...... ANMR method
     twoloop         1            E...A...... Two loop searching method
     fast            2            E...A...... Default fast search
  -aac_ms            <boolean>    E...A...... Force M/S stereo coding (default auto)
  -aac_is            <boolean>    E...A...... Intensity stereo coding (default true)
  -aac_pns           <boolean>    E...A...... Perceptual noise substitution (default true)
  -aac_tns           <boolean>    E...A...... Temporal noise shaping (default true)
  -aac_ltp           <boolean>    E...A...... Long term prediction (default false)
  -aac_pred          <boolean>    E...A...... AAC-Main prediction (default false)
  -aac_pce           <boolean>    E...A...... Forces the use of PCEs (default false)

Unfortunately, I don't know if PeerTube uses ffmpeg CLI or is controlled through libraries and if you can change additional settings.
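The defaults in a listing like the one above can be pulled out mechanically, which helps when comparing ffmpeg versions. A sketch that parses text in the format shown above; the regex is an assumption based only on these lines, and the function name is hypothetical:

```python
import re

# Matches rows such as:
#   "  -aac_coder   <int>   E...A...... Coding algorithm (...) (default twoloop)"
OPTION_RE = re.compile(r"^\s+-(\w+)\s+<(\w+)>.*\(default ([^)]+)\)\s*$")


def parse_defaults(help_text: str) -> dict:
    """Map each option name to its default from `ffmpeg -h encoder=aac` output."""
    defaults = {}
    for line in help_text.splitlines():
        match = OPTION_RE.match(line)
        if match:
            name, _type, default = match.groups()
            defaults[name] = default
    return defaults
```

Rows for named constants (anmr, twoloop, fast) do not start with a dash, so they are skipped.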

artenax avatar Mar 12 '24 02:03 artenax

Any temporary solutions so far? Audio quality is really bad.

Can you give us example and also tell us if your ffmpeg version has libfdk_aac encoder?

Chocobozzz avatar Mar 12 '24 05:03 Chocobozzz

@Chocobozzz said:

Any temporary solutions so far? Audio quality is really bad.

Can you give us example and also tell us if your ffmpeg version has libfdk_aac encoder?

Hi yes, transcoded: https://vid.kinuseka.us/w/2Sr3d9csWFcVoSSm4S5pFb compared to the original soundcloud: https://soundcloud.com/idnull/midnight-murder-club

It is noticeable that the high frequencies on the music sound compressed and not fully defined.

In fact there is no libfdk_aac encoder installed yet; let me see if installing it makes a difference.

Kinuseka avatar Mar 12 '24 07:03 Kinuseka

@Kinuseka Can you tell me if this version is better? https://asso.framasoft.org/drop/r/ULju3jwfkF#NMvYQWsIrqPaKhK2RpdpLkFAUppWrJ+YocYKXqb9ohw=

And can you give your ffmpeg version?

Chocobozzz avatar Mar 12 '24 09:03 Chocobozzz

@Chocobozzz said: @Kinuseka Can you tell me if this version is better? https://asso.framasoft.org/drop/r/ULju3jwfkF#NMvYQWsIrqPaKhK2RpdpLkFAUppWrJ+YocYKXqb9ohw=

And can you give your ffmpeg version?

Oh yeah, that sounds really close to the original.

FFmpeg version: ffmpeg version 4.4.2-0ubuntu0.22.04.1

Kinuseka avatar Mar 12 '24 09:03 Kinuseka

Oh yeah, that sounds really close to the original.

Ok thanks, because I just used the same upload process as you on my local PeerTube that uses ffmpeg 6.1.1 Can you upgrade your ffmpeg version to 6.1 and retry the upload process to see if it fixes the issue?

Chocobozzz avatar Mar 12 '24 09:03 Chocobozzz

@Chocobozzz said:

Oh yeah, that sounds really close to the original.

Ok thanks, because I just used the same upload process as you on my local PeerTube that uses ffmpeg 6.1.1 Can you upgrade your ffmpeg version to 6.1 and retry the upload process to see if it fixes the issue?

ffmpeg 6.1 fixes the audio quality issue. I had to use this repo since there is no official release of ffmpeg v6 for Ubuntu 22.04.
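Since the result here depended on the ffmpeg release, an admin could compare the installed version against 6.1 before uploading. A minimal sketch that parses the first line of `ffmpeg -version`; the 6.1 threshold is only what was reported in this thread, and the function name is hypothetical:

```python
import re


def ffmpeg_version(first_line: str):
    """Return (major, minor) parsed from the first line of `ffmpeg -version`,
    or None for builds without a numeric version (e.g. git snapshots)."""
    match = re.match(r"ffmpeg version n?(\d+)\.(\d+)", first_line)
    return (int(match.group(1)), int(match.group(2))) if match else None


old = ffmpeg_version("ffmpeg version 4.4.2-0ubuntu0.22.04.1")
print(old, old < (6, 1))
```

Tuples compare element by element, so (4, 4) sorts below (6, 1) without any string handling.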

Kinuseka avatar Mar 12 '24 10:03 Kinuseka

ffmpeg 6.1 fixes the audio quality issue

Yeah, spectra got better in ffmpeg >= 6.0, but the sound still sucks (in CBR mode). VBR (-q:a) mode is cool. It's most noticeable on the human voice. (when people howl? something like that) VBR mode in ffmpeg 6.1.1 seems even better.

artenax avatar Mar 12 '24 15:03 artenax

If you need ffmpeg 6 with fdk-aac use this static build (put it in /usr/local/bin). It is legal because it contains a Fedora patch that removes patented components (HE profiles). Or install Fedora on the server... Although, no, Fedora doesn't have libx264, unlike my build.

However, there is no zlib there. If you need zlib use this:

ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers built with gcc 5.5.0 (Linaro GCC 5.5-2017.10) 20171010 (ROSA) configuration: --prefix=/opt/ffmpeg --enable-pic --enable-gpl --enable-version3 --enable-static --disable-shared --disable-debug --as=nasm --enable-small --disable-doc --enable-gray --enable-libfdk-aac --enable-libx264 --disable-cuda-nvcc --disable-cuda-llvm --disable-lzma --disable-vaapi --disable-vdpau --disable-xlib --disable-libxcb --disable-vulkan --pkg-config-flags=--static --enable-libmp3lame --enable-libvorbis --enable-libopus --enable-libvpx --enable-libtls

Unfortunately, without SSE optimizations. And there's no AV1 decoder. Anyway, you can compile ffmpeg yourself, it's easy.

artenax avatar Mar 12 '24 15:03 artenax

Can you use -c:a aac -q:a 2.4 -cutoff 18000 -af volume=-2dB ? It's not that bad. It's inaccurate, but not terrible. Try experimenting with advanced settings: ffmpeg -h encoder=aac

AAC encoder AVOptions:
  -aac_coder         <int>        E...A...... Coding algorithm (from 0 to 2) (default twoloop)
     anmr            0            E...A...... ANMR method
     twoloop         1            E...A...... Two loop searching method
     fast            2            E...A...... Default fast search
  -aac_ms            <boolean>    E...A...... Force M/S stereo coding (default auto)
  -aac_is            <boolean>    E...A...... Intensity stereo coding (default true)
  -aac_pns           <boolean>    E...A...... Perceptual noise substitution (default true)
  -aac_tns           <boolean>    E...A...... Temporal noise shaping (default true)
  -aac_ltp           <boolean>    E...A...... Long term prediction (default false)
  -aac_pred          <boolean>    E...A...... AAC-Main prediction (default false)
  -aac_pce           <boolean>    E...A...... Forces the use of PCEs (default false)

Unfortunately, I don't know if PeerTube uses ffmpeg CLI or is controlled through libraries and if you can change additional settings.

Trying this setting using peertube-plugin-transcoding-profile-debug

my configuration is: Transcoding profile:

{
    "vod": [
        {
            "encoderName": "aac",
            "profileName": "AudioBetter",
            "outputOptions": [
                "-c:a aac",
                "-q:a 2.3",
                "-cutoff 18000"
            ]
        }
    ],
    "live": []
}

Encoders Priorities:

{
  "vod": [
    {
      "encoderName": "aac",
      "streamType": "audio",
      "priority": 1000
    }
  ],

  "live": [ ]
}
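If you want to try several -q:a values, both JSON documents can be generated from one set of parameters so the encoder name stays consistent between them. A sketch that only mirrors the shape shown above; it assumes nothing about the plugin beyond these fields, and the function name is hypothetical:

```python
import json


def debug_plugin_config(encoder: str, profile: str, options: list,
                        priority: int = 1000):
    """Build the profile and priority JSON documents in the shape shown above."""
    profiles = {
        "vod": [{
            "encoderName": encoder,
            "profileName": profile,
            "outputOptions": options,
        }],
        "live": [],
    }
    priorities = {
        "vod": [{
            "encoderName": encoder,
            "streamType": "audio",
            "priority": priority,
        }],
        "live": [],
    }
    return json.dumps(profiles, indent=2), json.dumps(priorities, indent=2)


profile_json, priority_json = debug_plugin_config(
    "aac", "AudioBetter", ["-c:a aac", "-q:a 2.3", "-cutoff 18000"])
print(profile_json)
print(priority_json)
```

The two printed strings can then be pasted into the plugin's settings fields.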


I can confirm the profile is being applied, since the options show up in htop during transcoding.

Kinuseka avatar Mar 20 '24 05:03 Kinuseka

With ffmpeg 6.1 the native aac audio is significantly better than with ffmpeg 4.4.

I am not entirely sure if VBR made any difference since I can't tell the difference. It could just be placebo but I guess some parts of the frequency are less crunchy and more detailed now (mid-highs maybe)

Kinuseka avatar Mar 20 '24 05:03 Kinuseka