Cropped FLAC produces audio artifacts (crackles, pops)

Open tuffnerdstuff opened this issue 2 years ago • 6 comments

Description

When rendering an MLT file that references a FLAC audio file, the rendered output contains audible artifacts (crackles, pops) if the producer's in offset is greater than zero (that is, the beginning is cropped). Otherwise the audio is clean.

Reproduction

Generate a 5 second sine wave FLAC audio using ffmpeg: ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" sine.flac
Create two MLT files crackle.mlt and clean.mlt with the following content:

<?xml version="1.0" standalone="no"?>
<mlt LC_NUMERIC="C" version="7.0.0" title="Shotcut version ARCH-21.08.29" producer="main_bin">
  <profile description="PAL 4:3 DV or DVD" width="1920" height="1080" progressive="1" sample_aspect_num="1" sample_aspect_den="1" display_aspect_num="16" display_aspect_den="9" frame_rate_num="60" frame_rate_den="1" colorspace="709"/>
  <chain id="sine" out="00:00:04.983">
    <property name="length">00:00:05.000</property>
    <property name="eof">pause</property>
    <property name="resource">sine.flac</property>
    <property name="mlt_service">avformat-novalidate</property>
    <property name="seekable">1</property>
    <property name="audio_index">0</property>
    <property name="video_index">-1</property>
    <property name="mute_on_pause">0</property>
    <property name="xml">was here</property>
  </chain>
  <playlist id="playlist0">
    <entry producer="sine" in="<in_offset>" out="00:00:04.983"/>
  </playlist>
</mlt>

replace <in_offset> of crackle.mlt with 00:00:01.000
replace <in_offset> of clean.mlt with 00:00:00.000
play crackle.mlt: melt crackle.mlt
play clean.mlt: melt clean.mlt

You will notice that playing crackle.mlt produces artifacts while playing clean.mlt doesn't.

Workaround

If you use uncompressed WAV audio instead of FLAC there are no issues whatsoever.

Technical Info

MLT version: 7.0.1
FFmpeg version: n4.4
OS: 5.10.68-1-MANJARO #1 SMP PREEMPT Wed Sep 22 12:29:47 UTC 2021 x86_64 GNU/Linux

Additional Info

The MLT data used here was created by reproducing the problem in Shotcut (ARCH-21.08.29) and stripping the resulting MLT file to the bare minimum (as far as I could tell without knowing too much about MLT XML). The same problem can be reproduced using Kdenlive (21.08.1), as it also uses MLT as its rendering backend. I also tried different FLAC bit depths (16 and 24 bit) as well as sample rates (44.1 kHz and 48 kHz), but the problem stays the same.

Oct 15 '21 15:10 tuffnerdstuff

Raw FLAC does not seek properly and must be muxed into some other container. I have not been able to fix that. It might be a limitation in FFmpeg. Someone else is welcome to try.

Oct 15 '21 15:10 ddennedy

Thanks for the quick reply! I will try using FLAC in a container (e.g. Matroska) and report back if I find another workaround. I'll also try to reproduce the problem using FFmpeg only. Is there a way to somehow log what MLT inputs into FFmpeg?

Oct 15 '21 16:10 tuffnerdstuff

I created an mka file (Matroska Audio) using the following command line:

ffmpeg -i sine.flac -map 0 -c copy sine.mka

Then I created mka.mlt referencing said mka file:

<?xml version="1.0" standalone="no"?>
<mlt LC_NUMERIC="C" version="7.0.0" title="Shotcut version ARCH-21.08.29" producer="main_bin">
  <profile description="PAL 4:3 DV or DVD" width="1920" height="1080" progressive="1" sample_aspect_num="1" sample_aspect_den="1" display_aspect_num="16" display_aspect_den="9" frame_rate_num="60" frame_rate_den="1" colorspace="709"/>
  <chain id="sine" out="00:00:04.983">
    <property name="length">00:00:05.000</property>
    <property name="eof">pause</property>
    <property name="resource">sine.mka</property>
    <property name="mlt_service">avformat-novalidate</property>
    <property name="seekable">1</property>
    <property name="audio_index">0</property>
    <property name="video_index">-1</property>
    <property name="mute_on_pause">0</property>
    <property name="xml">was here</property>
  </chain>
  <playlist id="playlist0">
    <entry producer="sine" in="00:00:01.000" out="00:00:04.983"/>
  </playlist>
</mlt>

When I play mka.mlt using melt mka.mlt, the audio still crackles, so the container does not seem to make a difference.

Oct 15 '21 21:10 tuffnerdstuff

In Kdenlive we've noticed crackles/pops when using 60fps projects. It doesn't happen with 24/30 fps though... Don't know if this is relevant but hope it helps find a fix:

https://bugs.kde.org/show_bug.cgi?id=410726

Oct 15 '21 21:10 frdbr

See also https://forum.shotcut.org/t/audio-distortion-when-split/29165

I still don't know how to fix it and need to work around it. Good luck finding a fix.

Oct 15 '21 21:10 ddennedy

@frdbr The problem with the kdenlive bug you reference is that it is more about general audio distortions, many of which have been fixed or significantly improved, while this is specifically about FLAC as a source.

Oct 15 '21 21:10 ddennedy

I have run into this bug now, and it's really reproducible.

Using 60fps video, I export the audio as a wav file and edit it in Audacity. From there the audio is exported as a .flac file and imported back into Kdenlive. The beginning and end of the file are cropped.

I did not notice this bug while using 25fps, but once I started working in 60fps it is definitely there. When I export the audio as mp3 from Audacity and replace the clip in Kdenlive, there is no audio crackling.

If I play the exported .flac file in another application, there is no crackling either.

Mar 06 '23 12:03 evertvorster

Simple steps to reproduce from @j-b-m:

Ok, so using the sample sine wave file provided in the Kdenlive bug report: sine.flac

The problem can be reproduced using this melt command: melt -profile atsc_1080p_60 sine.flac in=735 -consumer avformat:test1.wav

The resulting file has clear audio cracks at the start. Using an atsc_1080p_25 profile, I didn't get audio cracks.

Mar 08 '23 13:03 bmatherly

Thanks for the shortened setup @bmatherly ! It did not work for my setup with the sine generated via ffmpeg, but I adapted your approach and updated my instructions.

Out of curiosity I also tried to seek 1 second into the FLAC using FFmpeg. Strangely enough this produces a crystal clear wav:

ffmpeg -i sine.flac -ss 00:00:01.000 sine_ffmpeg.wav

So FFmpeg is somehow able to seek into FLAC without a problem. Maybe this helps narrow down the problem, since @ddennedy suspected it might be a limitation in FFmpeg.

Mar 08 '23 15:03 tuffnerdstuff

I have a proposed fix for this specific FLAC issue. Can someone please test this proposed change with their own test case?

diff --git a/src/modules/avformat/producer_avformat.c b/src/modules/avformat/producer_avformat.c
index 9e8a63df..d3e2188d 100644
--- a/src/modules/avformat/producer_avformat.c
+++ b/src/modules/avformat/producer_avformat.c
@@ -2792,9 +2792,10 @@ static int decode_audio( producer_avformat self, int *ignore, const AVPacket *pk
                if ( self->seekable || int_position > 0 )
                {
                        int64_t ahead_threshold = 2;
-                       if ( codec_context->codec_id == AV_CODEC_ID_WMAPRO )
+                       if ( codec_context->codec_id == AV_CODEC_ID_WMAPRO ||
+                                codec_context->codec_id == AV_CODEC_ID_FLAC )
                        {
-                               // WMAPro needs more tolerance for sync detection
+                               // Some codecs needs more tolerance for sync detection
                                ahead_threshold = 4;
                        }
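
For a sense of scale: ahead_threshold is counted in project frames, so the amount of audio it tolerates depends on the frame rate. The sketch below is illustrative arithmetic only, not MLT code. The helper name frames_to_samples and the 48000 Hz sample rate used in the example figures are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of MLT): number of audio samples spanned by
 * `frames` video frames at a frame rate of fps_num/fps_den.
 * The result is truncated toward zero. */
static int64_t frames_to_samples(int64_t frames, int sample_rate,
                                 int fps_num, int fps_den)
{
    assert(frames >= 0 && sample_rate > 0 && fps_num > 0 && fps_den > 0);
    return (frames * sample_rate * fps_den) / fps_num;
}
```

At 60 fps and 48000 Hz, a threshold of 2 frames spans 1600 samples and a threshold of 4 frames spans 3200 samples. At 25 fps, the same 4 frames span 7680 samples.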

Mar 09 '23 04:03 bmatherly

@bmatherly I just tested your changes and it does make a difference. Instead of several audible clicks there is now just one.

Update: Just out of curiosity, I increased ahead_threshold to 8 and the resulting WAV did not produce any audible clicks.

Update 2: I tested a 60 second sine at 44.1 and 48 kHz with ahead_threshold=8 in order to rule out longer click periods, but the audio was clear.

Mar 09 '23 10:03 tuffnerdstuff

Thanks for your testing. I think that 8 frames of tolerance is too much: it would throw lip sync off at lower frame rates. I think the problem is that the PTS offset calculation does not account for samples already in the buffer, and FLAC frames are very large, which leaves many samples in the buffer. Here is another proposed fix. I hope you can continue to help me test.

diff --git a/src/modules/avformat/producer_avformat.c b/src/modules/avformat/producer_avformat.c
index 9e8a63df..7449c5aa 100644
--- a/src/modules/avformat/producer_avformat.c
+++ b/src/modules/avformat/producer_avformat.c
@@ -2705,6 +2705,7 @@ static int decode_audio( producer_avformat self, int *ignore, const AVPacket *pk
 
        int channels = codec_context->channels;
        int audio_used = self->audio_used[ index ];
+       int audio_used_at_start = audio_used;
        int ret = 0;
        int discarded = 1;
        int sizeof_sample = sample_bytes( codec_context );
@@ -2763,6 +2764,7 @@ static int decode_audio( producer_avformat self, int *ignore, const AVPacket *pk
                int n = FFMIN( audio_used, *ignore );
                *ignore -= n;
                audio_used -= n;
+               audio_used_at_start -= n;
                memmove( audio_buffer, &audio_buffer[ n * channels * sizeof_sample ],
                                 audio_used * channels * sizeof_sample );
        }
@@ -2771,7 +2773,9 @@ static int decode_audio( producer_avformat self, int *ignore, const AVPacket *pk
        // Skip this on non-seekable, audio-only inputs.
        if ( !discarded && pkt->pts >= 0 && ( self->seekable || self->video_format ) && *ignore == 0 && audio_used > samples / 2 )
        {
-               int64_t pts = pkt->pts;
+               double timebase = av_q2d( context->streams[ index ]->time_base );
+               int64_t pts_offset = lrint((double)audio_used_at_start / timebase / (double)codec_context->sample_rate);
+               int64_t pts = pkt->pts - pts_offset;
                if ( self->first_pts != AV_NOPTS_VALUE )
                        pts -= av_rescale_q(self->first_pts,
                                                                context->streams[self->video_index]->time_base,
@@ -2780,7 +2784,6 @@ static int decode_audio( producer_avformat self, int *ignore, const AVPacket *pk
                        pts -= av_rescale_q(context->start_time,
                                                                AV_TIME_BASE_Q,
                                                                context->streams[ index ]->time_base);
-               double timebase = av_q2d( context->streams[ index ]->time_base );
                int64_t int_position = llrint( timebase * pts * fps );
                int64_t req_position = llrint( timecode * fps );
                int64_t req_pts =      llrint( timecode / timebase );
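
As a rough illustration of the offset this patch introduces (a sketch, not MLT code): the helper below mirrors the pts_offset expression from the diff, converting a count of samples already in the buffer into ticks of the stream time base. Unlike the patch, which uses floating point and lrint, this sketch uses integer arithmetic with round-to-nearest. The name samples_to_pts_ticks and the example figures (4096 buffered samples, 44100 Hz, time bases of 1/44100 and 1/1000) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of MLT): converts samples already sitting in
 * the audio buffer into ticks of a stream time base tb_num/tb_den, i.e.
 * buffered_samples / timebase / sample_rate, rounded to nearest. */
static int64_t samples_to_pts_ticks(int buffered_samples, int sample_rate,
                                    int tb_num, int tb_den)
{
    assert(buffered_samples >= 0 && sample_rate > 0 && tb_num > 0 && tb_den > 0);
    int64_t numerator = (int64_t) buffered_samples * tb_den;
    int64_t denominator = (int64_t) sample_rate * tb_num;
    /* Adding half the denominator rounds the quotient to the nearest tick. */
    return (numerator + denominator / 2) / denominator;
}
```

With a 1/44100 time base, 4096 buffered samples at 44100 Hz correspond to 4096 ticks. With a 1/1000 time base they correspond to 93 ticks, which is about 92.9 ms, or roughly 5.6 frames at 60 fps.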

Mar 10 '23 04:03 bmatherly

@bmatherly Sure, I'm glad I can help! The problem is that I cannot apply the patch using git apply. Git always tells me that it cannot find the text pattern in the target file. I applied the first patch manually, but to avoid mistakes on my side I think it is better to apply your patch directly. Could you push a branch containing your changes? Then we can be sure that I test the right thing.

Mar 10 '23 09:03 tuffnerdstuff

Here you go: https://github.com/mltframework/mlt/pull/885

Mar 10 '23 12:03 bmatherly

Thanks for pushing the branch @bmatherly! I tested your fix and the audio is crystal clear now :+1:

Mar 10 '23 19:03 tuffnerdstuff

Just tested here, and for me there is also no more popping noises when using FLAC.

Mar 13 '23 18:03 evertvorster

Thanks for testing. In addition to the FLAC issue, we have a high concern about regression with other formats. If you are willing, it would be very helpful if you could help us test other formats.

For example:

Put a clip on the timeline
Snip the clips at various places
Play the timeline - does it sound the same as the currently released version?

Mar 13 '23 23:03 bmatherly

Thanks @bmatherly and @ddennedy for putting so much effort into fixing this issue!

Mar 16 '23 12:03 tuffnerdstuff
