m4b-tool
[Solution] Fade in/out effect for MP3s merged into an M4B
I spent quite a bit of time and several attempts figuring out how to add a fade in/out effect between MP3s merged into an M4B. I'm sharing my solution here for future visitors. Note that this solution could easily be integrated natively into m4b-tool, but my schedule is very busy and unfortunately I don't have the bandwidth to open a pull request.
My requirements:
- No extra re-encoding. The fade in/out effect should be applied on the fly between the decoding step and the encoding step to avoid degrading the quality.
- Preserve all metadata. This means that I can't use FFmpeg to re-encode and glue the files together at the same time. The final lossless merge operation ffmpeg -f concat -c copy run by m4b-tool is required to preserve them.
My solution:
find . -iname '*.mp3' -print0 | xargs -0 -I{} -P 8 ffmpeg -i {} -f lavfi -i anullsrc -max_muxing_queue_size 9999 -map_metadata 0 -strict experimental -movflags +faststart -vn -y -ab 196k -ar 44100 -ac 2 -acodec libfdk_aac -filter_complex '[0]afade=t=in:d=1:curve=tri[a]; [1]atrim=0:0.7[t]; [a][t]acrossfade=d=0.7:o=1:c1=tri:c2=nofade' -f mp4 {}.m4b
m4b-tool merge -vvv --debug --no-conversion --include-extensions=m4b --output-file="merged.m4b" .
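One side effect of the first command is that `{}.m4b` keeps the original extension, so `01.mp3` is converted to `01.mp3.m4b`. If cleaner names are wanted, the per-file work could be moved into a small shell function. This is only a sketch: `m4b_name` and `convert_one` are hypothetical helper names, and the ffmpeg options are copied unchanged from the command above.

```shell
#!/bin/sh
# Hypothetical helper: replace the last extension of a path with .m4b.
m4b_name() {
  printf '%s.m4b\n' "${1%.*}"
}

# Hypothetical wrapper: convert one MP3 with the same options as the one-liner,
# but write to the name computed by m4b_name instead of "{}.m4b".
convert_one() {
  ffmpeg -i "$1" -f lavfi -i anullsrc -max_muxing_queue_size 9999 \
    -map_metadata 0 -strict experimental -movflags +faststart -vn -y \
    -ab 196k -ar 44100 -ac 2 -acodec libfdk_aac \
    -filter_complex '[0]afade=t=in:d=1:curve=tri[a]; [1]atrim=0:0.7[t]; [a][t]acrossfade=d=0.7:o=1:c1=tri:c2=nofade' \
    -f mp4 "$(m4b_name "$1")"
}
```

Saved as a script that ends with `convert_one "$1"` (say `convert-one.sh`, a made-up name), it could be driven by the same pipeline: `find . -iname '*.mp3' -print0 | xargs -0 -n 1 -P 8 ./convert-one.sh`.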
Note:
For the conversion step I directly use a FFmpeg command (ffmpeg -i {} -f lavfi -i […]) instead of m4b-tool for two reasons:
- m4b-tool silently ignores the --ffmpeg-param for the Fraunhofer FDK AAC (libfdk_aac) codec (!) because m4b-tool directly runs ffmpeg instead of using the Ffmpeg.php executable abstraction.
  - Note however that --ffmpeg-param is properly applied when using the native FFmpeg AAC Encoder (aac) codec. I use the Fraunhofer FDK AAC codec as it has a better encoding quality for a given bitrate compared to the native aac encoder.
- The --ffmpeg-param option of m4b-tool indiscriminately applies to both the conversion step (when using the native FFmpeg AAC Encoder) and the merge step (no matter what). This is due to the fact that they both use the Ffmpeg.php executable abstraction.
  - But we don't want to do that as it would apply the fade filter twice!
  - Additionally, the FFmpeg parameters -f concat -c copy used by m4b-tool for the merge aren't compatible with FFmpeg filters. Removing these options would both force a re-encoding (which degrades the sound quality) and drop the individual metadata of each converted file (they are preserved thanks to the -f concat -c copy options).
Explanations: The interesting parts are the following options in the first line. They add a fade-in + fade-out effect losslessly without an extra re-encoding step thanks to a filtergraph:
-f lavfi -i anullsrc
-filter_complex '[0]afade=t=in:d=1:curve=tri[a]; [1]atrim=0:0.7[t]; [a][t]acrossfade=d=0.7:o=1:c1=tri:c2=nofade'
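Since the fade-out length appears twice in the filtergraph (once in `atrim`, once in `acrossfade`) and the two values have to match, it may be convenient to generate the string from two durations. A minimal sketch, assuming a POSIX shell; `build_fade_filter` is a hypothetical helper name, not something ffmpeg or m4b-tool provides:

```shell
#!/bin/sh
# Sketch: build the -filter_complex string from two durations (in seconds).
# build_fade_filter is a hypothetical name used for illustration only.
build_fade_filter() {
  fade_in=$1    # duration of the fade-in at the start of the file
  fade_out=$2   # duration of the fade-out, reused for both atrim and acrossfade
  printf '[0]afade=t=in:d=%s:curve=tri[a]; [1]atrim=0:%s[t]; [a][t]acrossfade=d=%s:o=1:c1=tri:c2=nofade\n' \
    "$fade_in" "$fade_out" "$fade_out"
}

# Prints the exact filtergraph used in the solution above.
build_fade_filter 1 0.7
```

The result can then be passed as `-filter_complex "$(build_fade_filter 1 0.7)"`, which keeps the two 0.7 values from drifting apart when the durations are tuned.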
Detailed breakdown for the curious:
- ffmpeg is provided with two stream inputs:
  - The mp3 file: -i {}.
  - A libavfilter input virtual device (-f lavfi) that just inputs silent audio (-i anullsrc). Check Step 3 to see why we need it.
- Step 1: [0]afade=t=in:d=1:curve=tri[a] adds a fade-in effect at the start of the decoded file.
  - [0] is used as the input of the afade filter command. It corresponds to the first input passed to FFmpeg. Note that we can't use this filter for the fade-out at the end of the input stream as we would need to provide an absolute time offset in the stream, which we can't calculate within the filter pipeline. (Filter streams are non-rewindable and the afade filter command doesn't support relative time offsets to the end of the stream).
  - t=in for a fade-in effect. Since no start time st is specified, the effect applies at the beginning of the file.
  - d=1 means that the fade-in effect has a total duration of 1 second.
  - curve=tri to select a triangular linear fade-in transition function.
  - [a] to direct the output of this step to a named stream a.
- Step 2: [1]atrim=0:0.7[t] cuts the anullsrc virtual silent stream to last 0.7 seconds. It needs to have the same duration as the one specified by the d parameter of acrossfade in the next filter.
  - [1] is the input of the atrim filter command. This corresponds to the second input passed to FFmpeg, here anullsrc.
  - 0:0.7 is the trim window. Here the trim will only keep the 0.7 seconds of the silent anullsrc stream.
- Step 3: [a][t]acrossfade=d=0.7:o=1:c1=tri:c2=nofade adds a cross fade effect at the end of the decoded stream [a] + start of the second stream [t]. I use a trick (detailed below) to make it only add a fade-out effect at the end of [a] without changing its duration.
  - [a] is used as the first input of acrossfade. It corresponds to the output of the first step i.e. the decoded file stream with a fade-in effect at the start.
  - [t] is used as the second input of acrossfade. It corresponds to the output of the trim in the second step i.e. a silent stream with a duration of 0.7 seconds.
  - d=0.7 is the duration of the fade-out effect. It's important for it to be equal to the atrim length of Step 2.
    - If it's longer than the atrim step then the effect won't be applied at all (the second input stream needs to have a duration that is at least as long as the crossfade effect).
    - If it's shorter than the atrim step, then a silence is added to the end of the output stream. We don't want that to increase the duration of the stream and add a silent section at the end, but instead only add a fade-out effect.
  - o=1 means that the two streams should overlap during the cross-fade (fade out the first stream and fade in the second stream at the same time). This is the main trick of this filter_complex pipeline.
    - During the last 0.7 seconds, [a] fades out while [t] fades in at the same time.
    - By the time the 0.7s [a] fade-out is done, the [t] silent stream fade-in is also over (because we trim it to 0.7s which is also the cross-fade duration).
    - Overall only a fade-out effect is applied as the second stream [t] is silent so the fade-in of [t] doesn't affect the output stream (a no-op).
  - c1=tri to select a triangular linear fade-out transition for the first stream.
  - c2=nofade to select an identity curve for the fade-in transition of the second stream. The choice of this curve shouldn't matter as the [t] stream is silent anyway.
  - The output of acrossfade is the output of the whole filter pipeline.
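For comparison, Step 1 notes that afade can't do the fade-out on its own because it needs an absolute start time. If an extra probing pass per file is acceptable, that offset could be computed outside the filtergraph and passed in through afade's `st` parameter. This is a different approach from the silent-stream trick above, shown only as a sketch: `fade_out_start` is a hypothetical helper, and the duration ffprobe reports for an MP3 may be an estimate.

```shell
#!/bin/sh
# Hypothetical helper: absolute start time of a fade-out, i.e. the total
# duration minus the fade length, clamped at 0. awk does the float math.
fade_out_start() {
  awk -v total="$1" -v fade="$2" \
    'BEGIN { s = total - fade; if (s < 0) s = 0; printf "%.3f\n", s }'
}

# Assumed usage (needs ffprobe; input.mp3 and output.m4b are placeholders):
#   total=$(ffprobe -v error -show_entries format=duration \
#             -of default=noprint_wrappers=1:nokey=1 input.mp3)
#   start=$(fade_out_start "$total" 0.7)
#   ffmpeg -i input.mp3 \
#     -af "afade=t=in:d=1:curve=tri,afade=t=out:st=$start:d=0.7:curve=tri" \
#     -acodec libfdk_aac output.m4b
```

The acrossfade trick avoids both the extra ffprobe call and any dependence on an accurate reported duration, which is presumably why it is the more robust choice for batch conversion.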
Phew, thank you for this huge and detailed investigation.
I (personally) do not have ANY use case for this - fading in does indeed modify the audio in a way I never would like to have it. Furthermore I don't think this is really an issue... more like a detailed guide to achieve something.
The --ffmpeg-param thing was a quick and dirty approach to provide some extended feature, but it was a really, REALLY bad idea. It causes more issues than it solves in my opinion.
What I should have done instead was to provide a small plugin API to modify commands before they are executed. Example:
// my-plugin.php
m4btool_register_command_plugin(function(array $command, CommandContext $context) {
    // leave anything that is not an ffmpeg command untouched
    if (!in_array("ffmpeg", $command, true)) {
        return $command;
    }
    // modify command as you wish
    // ....
    // then return it
    return $command;
});
And then running
m4b-tool merge --command-plugin="my-plugin.php" ....
What do you think? Would this be better for your use case?
Hmm, I think that a plugin API would still have a learning curve and wouldn't be very convenient for one-off solutions. Just like with --ffmpeg-param, you would need to understand which commands m4b-tool runs and in which order. You'd additionally have to figure out how to patch the array, making sure that you only apply the changes at the right steps of the process.
A plugin API could definitely be useful if you plan on welcoming plugin contributions. But then it would require substantial effort to maintain these plugins, considering that they would patch the command (not necessarily in clean, future-proof ways).
In my case the hardest part was figuring out where/how the ffmpeg commands were built, realizing that --ffmpeg-param didn't behave the way I assumed it would, and finally deciding that it would just be less effort to add an echo right before the ffmpeg commands get executed so that I could grab the commands and modify them manually.
I think that a great starting point would be to print the ffmpeg commands that m4b-tool runs (maybe by default to make them easier to discover). People who want custom behaviors could just use --dry-run, modify the ffmpeg commands, and manually run them. If they want to contribute the feature back to m4b-tool they can add a new option and do a PR.
What do you think?
Oh that is easy. Just use --debug. Maybe it would be nice to have ONLY the commands printed, so an option with --command-logfile or something may be the solution for this.
@sandreas Does it also work with FDK AAC? iirc the command was built differently but I'm not sure if the debug log works anyway or not.