mlt
Cropped FLAC produces audio artifacts (crackles, pops)
Description
When rendering an MLT file that references a FLAC audio file, the rendered output contains audible artifacts (crackles, pops) if the producer's in offset is greater than zero (i.e. the beginning is cropped). With an in offset of zero the audio is clean.
Reproduction
- Generate a 5 second sine wave FLAC audio using ffmpeg:
ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" sine.flac
- Create two MLT files crackle.mlt and clean.mlt with the following content:
<?xml version="1.0" standalone="no"?>
<mlt LC_NUMERIC="C" version="7.0.0" title="Shotcut version ARCH-21.08.29" producer="main_bin">
<profile description="PAL 4:3 DV or DVD" width="1920" height="1080" progressive="1" sample_aspect_num="1" sample_aspect_den="1" display_aspect_num="16" display_aspect_den="9" frame_rate_num="60" frame_rate_den="1" colorspace="709"/>
<chain id="sine" out="00:00:04.983">
<property name="length">00:00:05.000</property>
<property name="eof">pause</property>
<property name="resource">sine.flac</property>
<property name="mlt_service">avformat-novalidate</property>
<property name="seekable">1</property>
<property name="audio_index">0</property>
<property name="video_index">-1</property>
<property name="mute_on_pause">0</property>
<property name="xml">was here</property>
</chain>
<playlist id="playlist0">
<entry producer="sine" in="<in_offset>" out="00:00:04.983"/>
</playlist>
</mlt>
- In crackle.mlt, replace <in_offset> with 00:00:01.000
- In clean.mlt, replace <in_offset> with 00:00:00.000
- play crackle.mlt:
melt crackle.mlt
- play clean.mlt:
melt clean.mlt
You will notice that playing crackle.mlt produces artifacts while clean.mlt doesn't.
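The manual replacement steps above could also be scripted. The following is a minimal sketch, not part of the original report: the helper names `fill_in_offset` and `write_test_files` are made up for illustration, and the template is the XML from this issue with its `<in_offset>` placeholder left in place.

```python
from pathlib import Path

# Template copied from the issue text; "<in_offset>" is the placeholder.
MLT_TEMPLATE = """<?xml version="1.0" standalone="no"?>
<mlt LC_NUMERIC="C" version="7.0.0" title="Shotcut version ARCH-21.08.29" producer="main_bin">
<profile description="PAL 4:3 DV or DVD" width="1920" height="1080" progressive="1" sample_aspect_num="1" sample_aspect_den="1" display_aspect_num="16" display_aspect_den="9" frame_rate_num="60" frame_rate_den="1" colorspace="709"/>
<chain id="sine" out="00:00:04.983">
<property name="length">00:00:05.000</property>
<property name="eof">pause</property>
<property name="resource">sine.flac</property>
<property name="mlt_service">avformat-novalidate</property>
<property name="seekable">1</property>
<property name="audio_index">0</property>
<property name="video_index">-1</property>
<property name="mute_on_pause">0</property>
<property name="xml">was here</property>
</chain>
<playlist id="playlist0">
<entry producer="sine" in="<in_offset>" out="00:00:04.983"/>
</playlist>
</mlt>
"""

# File name -> in offset, as described in the reproduction steps.
OFFSETS = {
    "crackle.mlt": "00:00:01.000",  # beginning cropped by one second
    "clean.mlt": "00:00:00.000",    # not cropped
}


def fill_in_offset(template: str, in_offset: str) -> str:
    """Return the template with the <in_offset> placeholder replaced."""
    return template.replace("<in_offset>", in_offset)


def write_test_files(directory: Path) -> None:
    """Write crackle.mlt and clean.mlt into the given directory."""
    for name, offset in OFFSETS.items():
        (directory / name).write_text(fill_in_offset(MLT_TEMPLATE, offset))


if __name__ == "__main__":
    write_test_files(Path("."))
```

Running the script next to sine.flac should leave both files ready for `melt crackle.mlt` and `melt clean.mlt`.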
Workaround
If you use uncompressed WAV audio instead of FLAC there are no issues whatsoever.
Technical Info
- MLT version:
7.0.1
- FFmpeg version:
n4.4
- OS:
5.10.68-1-MANJARO #1 SMP PREEMPT Wed Sep 22 12:29:47 UTC 2021 x86_64 GNU/Linux
Additional Info
The MLT data used here was created by reproducing the problem in Shotcut (ARCH-21.08.29) and stripping the resulting MLT file to the bare minimum (as far as I could tell without knowing too much about MLT XML). The same problem can be reproduced using Kdenlive (21.08.1), as it also uses MLT as its rendering backend. I also tried different FLAC bit depths (16 and 24 bit) as well as sample rates (44.1 kHz and 48 kHz), but the problem stays the same.
Raw FLAC does not seek properly and must be muxed into some other container. I have not been able to fix that. It might be a limitation in FFmpeg. Someone else is welcome to try.
Thanks for the quick reply! I will try using FLAC in a container (e.g. Matroska) and report back if I find another workaround. I'll also try to reproduce the problem using FFmpeg only. Is there a way to somehow log what MLT inputs into FFmpeg?
I created an mka file (Matroska Audio) using the following command-line: ffmpeg -i sine.flac -map 0 -c copy sine.mka
Then I created mka.mlt referencing said mka file:
<?xml version="1.0" standalone="no"?>
<mlt LC_NUMERIC="C" version="7.0.0" title="Shotcut version ARCH-21.08.29" producer="main_bin">
<profile description="PAL 4:3 DV or DVD" width="1920" height="1080" progressive="1" sample_aspect_num="1" sample_aspect_den="1" display_aspect_num="16" display_aspect_den="9" frame_rate_num="60" frame_rate_den="1" colorspace="709"/>
<chain id="sine" out="00:00:04.983">
<property name="length">00:00:05.000</property>
<property name="eof">pause</property>
<property name="resource">sine.mka</property>
<property name="mlt_service">avformat-novalidate</property>
<property name="seekable">1</property>
<property name="audio_index">0</property>
<property name="video_index">-1</property>
<property name="mute_on_pause">0</property>
<property name="xml">was here</property>
</chain>
<playlist id="playlist0">
<entry producer="sine" in="00:00:01.000" out="00:00:04.983"/>
</playlist>
</mlt>
When I play mka.mlt using melt mka.mlt, the audio is still crackling, so the container does not seem to make a difference.
In Kdenlive we've noticed crackles/pops when using 60fps projects. It doesn't happen with 24/30 fps though... Don't know if this is relevant but hope it helps find a fix:
https://bugs.kde.org/show_bug.cgi?id=410726
See also https://forum.shotcut.org/t/audio-distortion-when-split/29165
Still don't know how to fix it and need to workaround it. Good luck finding a fix.
@frdbr The problem with the kdenlive bug you reference is that it is more about general audio distortions, many of which have been fixed or significantly improved, while this is specifically about FLAC as a source.
I have run into this bug now, and it's really reproducible.
Using 60fps video, I export the audio as a wav file and edit it in Audacity. From there the audio is exported as a .flac file and imported back into Kdenlive. The beginning and end of the file are cropped.
I did not notice this bug while using 25fps, but once I started working in 60fps it is definitely there. When I export the audio as mp3 from audacity, and replace the clip in kdenlive, there is no audio crackling.
If I play the exported .flac file in another application, there is no crackling either.
Simple steps to reproduce from @j-b-m:
Ok, so using the sample sine wav file provided in the Kdenlive bug report: sine.flac
The problem can be reproduced using this melt command: melt -profile atsc_1080p_60 sine.flac in=735 -consumer avformat:test1.wav
The resulting file has clear audio cracks at start. Using an atsc_1080p_25 profile I didn't get audio cracks
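One detail worth keeping in mind when comparing the two profiles: if `in=735` is interpreted as a frame number in the profile's frame rate (an assumption here, as are the exact rates of 60 and 25 fps for the two profiles), the same value seeks to different points in the file. A small sketch of that arithmetic, with a hypothetical helper name:

```python
from fractions import Fraction


def frames_to_seconds(frame: int, fps: Fraction) -> Fraction:
    """Convert a frame number to seconds for a given frame rate."""
    return Fraction(frame) / fps


# Assumed frame rates for the two profiles named in the comment above.
PROFILES = {
    "atsc_1080p_60": Fraction(60),
    "atsc_1080p_25": Fraction(25),
}

for name, fps in PROFILES.items():
    seconds = frames_to_seconds(735, fps)
    print(f"{name}: in=735 -> {float(seconds):.2f} s")
```

Under these assumptions the 60 fps run starts 12.25 s into the file and the 25 fps run starts 29.4 s in, so the two tests differ in seek position as well as in frame rate.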
Thanks for the shortened setup @bmatherly ! It did not work for my setup with the sine generated via ffmpeg, but I adapted your approach and updated my instructions.
Out of curiosity I also tried to seek 1 second into the FLAC using FFmpeg. Strangely enough this produces a crystal clear wav:
ffmpeg -i sine.flac -ss 00:00:01.000 sine_ffmpeg.wav
So FFmpeg is somehow able to seek into FLAC without a problem. Maybe this helps narrowing down the problem as @ddennedy assumed it might be a limitation in FFmpeg.
I have a proposed fix for this specific FLAC issue. Can someone please test this proposed change with their own test case?
diff --git a/src/modules/avformat/producer_avformat.c b/src/modules/avformat/producer_avformat.c
index 9e8a63df..d3e2188d 100644
--- a/src/modules/avformat/producer_avformat.c
+++ b/src/modules/avformat/producer_avformat.c
@@ -2792,9 +2792,10 @@ static int decode_audio( producer_avformat self, int *ignore, const AVPacket *pk
if ( self->seekable || int_position > 0 )
{
int64_t ahead_threshold = 2;
- if ( codec_context->codec_id == AV_CODEC_ID_WMAPRO )
+ if ( codec_context->codec_id == AV_CODEC_ID_WMAPRO ||
+ codec_context->codec_id == AV_CODEC_ID_FLAC )
{
- // WMAPro needs more tolerance for sync detection
+ // Some codecs needs more tolerance for sync detection
ahead_threshold = 4;
}
@bmatherly I just tested your changes and it does make a difference. Instead of several audible clicks there is now just one.
Update: Just out of curiosity, I increased ahead_threshold to 8 and the resulting WAV did not produce any audible clicks.
Update 2: I tested a 60 second sine at 44.1 and 48 kHz with ahead_threshold=8 in order to rule out longer click periods, but the audio was clear.
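To put the threshold values tried so far in perspective, the sketch below converts a tolerance given in video frames into milliseconds. It assumes that ahead_threshold is counted in video frames at the project frame rate, and `tolerance_ms` is a hypothetical helper, not an MLT function.

```python
def tolerance_ms(threshold_frames: int, fps: float) -> float:
    """Duration in milliseconds of a tolerance given in video frames."""
    return 1000.0 * threshold_frames / fps


# Thresholds mentioned in this thread, at a few common frame rates.
for threshold in (2, 4, 8):
    for fps in (24.0, 25.0, 60.0):
        ms = tolerance_ms(threshold, fps)
        print(f"ahead_threshold={threshold} at {fps:g} fps -> {ms:.1f} ms")
```

Under that assumption, a threshold of 8 is about 133 ms at 60 fps but about 333 ms at 24 fps, so the same value is much looser in time at lower frame rates.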
Thanks for your testing. I think that 8 frames of tolerance is too much: it would put the lip sync off at lower frame rates. I think the problem is that the PTS offset calculation does not account for samples already in the buffer. FLAC frames are very large, which leaves many samples in the buffer. Here is another proposed fix. I hope you can continue to help me test.
diff --git a/src/modules/avformat/producer_avformat.c b/src/modules/avformat/producer_avformat.c
index 9e8a63df..7449c5aa 100644
--- a/src/modules/avformat/producer_avformat.c
+++ b/src/modules/avformat/producer_avformat.c
@@ -2705,6 +2705,7 @@ static int decode_audio( producer_avformat self, int *ignore, const AVPacket *pk
int channels = codec_context->channels;
int audio_used = self->audio_used[ index ];
+ int audio_used_at_start = audio_used;
int ret = 0;
int discarded = 1;
int sizeof_sample = sample_bytes( codec_context );
@@ -2763,6 +2764,7 @@ static int decode_audio( producer_avformat self, int *ignore, const AVPacket *pk
int n = FFMIN( audio_used, *ignore );
*ignore -= n;
audio_used -= n;
+ audio_used_at_start -= n;
memmove( audio_buffer, &audio_buffer[ n * channels * sizeof_sample ],
audio_used * channels * sizeof_sample );
}
@@ -2771,7 +2773,9 @@ static int decode_audio( producer_avformat self, int *ignore, const AVPacket *pk
// Skip this on non-seekable, audio-only inputs.
if ( !discarded && pkt->pts >= 0 && ( self->seekable || self->video_format ) && *ignore == 0 && audio_used > samples / 2 )
{
- int64_t pts = pkt->pts;
+ double timebase = av_q2d( context->streams[ index ]->time_base );
+ int64_t pts_offset = lrint((double)audio_used_at_start / timebase / (double)codec_context->sample_rate);
+ int64_t pts = pkt->pts - pts_offset;
if ( self->first_pts != AV_NOPTS_VALUE )
pts -= av_rescale_q(self->first_pts,
context->streams[self->video_index]->time_base,
@@ -2780,7 +2784,6 @@ static int decode_audio( producer_avformat self, int *ignore, const AVPacket *pk
pts -= av_rescale_q(context->start_time,
AV_TIME_BASE_Q,
context->streams[ index ]->time_base);
- double timebase = av_q2d( context->streams[ index ]->time_base );
int64_t int_position = llrint( timebase * pts * fps );
int64_t req_position = llrint( timecode * fps );
int64_t req_pts = llrint( timecode / timebase );
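The core arithmetic of the patch above can be sketched outside of MLT as follows. This is an illustration under assumptions, not the actual implementation: the function names are made up, the block of 4096 buffered samples is an arbitrary example value, and Python's `round` on a `Fraction` stands in for `lrint`.

```python
from fractions import Fraction


def pts_offset(audio_used_at_start: int, time_base: Fraction, sample_rate: int) -> int:
    """Duration of the already buffered samples, in stream time-base units."""
    buffered_seconds = Fraction(audio_used_at_start, sample_rate)
    return round(buffered_seconds / time_base)


def corrected_pts(pkt_pts: int, audio_used_at_start: int,
                  time_base: Fraction, sample_rate: int) -> int:
    """Packet PTS moved back to the first sample still held in the buffer."""
    return pkt_pts - pts_offset(audio_used_at_start, time_base, sample_rate)


def video_frames_spanned(samples: int, sample_rate: int, fps: Fraction) -> Fraction:
    """How many video frames a run of audio samples covers."""
    return Fraction(samples, sample_rate) * fps


if __name__ == "__main__":
    sample_rate = 48000
    buffered = 4096  # example value only, not taken from the thread
    # Time base of 1/sample_rate: the offset equals the sample count.
    print(pts_offset(buffered, Fraction(1, sample_rate), sample_rate))
    # Millisecond time base: the offset is the buffered duration in ms.
    print(pts_offset(buffered, Fraction(1, 1000), sample_rate))
    # The same samples expressed in video frames at 60 and 25 fps.
    print(float(video_frames_spanned(buffered, sample_rate, Fraction(60))))
    print(float(video_frames_spanned(buffered, sample_rate, Fraction(25))))
```

With these example numbers, ignoring the buffered samples would misplace the packet by roughly 5 video frames at 60 fps and roughly 2 at 25 fps, which would be in line with the clicks showing up mainly in 60 fps projects.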
@bmatherly Sure, I'm glad I can help! The problem is that I cannot apply the patch using git apply. Git always tells me that it cannot find the text pattern in the target file. I applied the first patch manually, but to avoid mistakes on my side I think it is better to apply your patch directly. Or can you push a branch containing your changes? Then we can be sure that I test the right thing.
Here you go: https://github.com/mltframework/mlt/pull/885
Thanks for pushing the branch @bmatherly! I tested your fix and the audio is crystal clear now :+1:
Just tested here, and for me there is also no more popping noises when using FLAC.
Thanks for testing. In addition to the FLAC issue, we have a high concern about regression with other formats. If you are willing, it would be very helpful if you could help us test other formats.
For example:
- Put a clip on the timeline
- Snip the clips at various places
- Play the timeline - does it sound the same as the currently released version?
Thanks @bmatherly and @ddennedy for putting so much effort in fixing this issue!