Audio fade-in always late by one frame
I was hoping this would be my first bug fix, but I was immediately baffled by the code.
Reproducing this is easy. I created some noise and silence with Audacity. Then, using a 24 fps profile (to make frame boundaries easy to find later), I added a one-second clip of noise followed immediately by a one-second clip of silence, and faded the noise in and out by one frame each.
The output was as I expected, based on experience: the fade-in was late by one frame. Looking at the output in Audacity, frame 0 was silent, and the sound faded in between the beginning of frame 1 and the beginning of frame 2. The fade-out was accurate, though.
What baffled me is that, in the mlt file, the first fade was in="0" out="1" (with "in" omitted, apparently because its default value is 0). That would make sense: the fade is supposed to start at the beginning of frame 0 and end at the beginning of frame 1. But as I said, the actual fade is late by one frame. Even more baffling was the fade-out transition: it was in="22" out="23", even though the one-second noise clip actually ends before frame 24.
So it seems that either the in/out frames for volume transitions aren't very intuitive, or kdenlive is (inadvertently?) working around an mlt bug. I would expect frame numbers to represent a specific point, e.g. the time of the beginning of the given frame. But the frame values that represent the fade-out transition are not intuitive to me.
The "easy" fix would be to move the fade-in transition frames back by one, but in my example that leads to an "in" of -1, which appears to represent the end of the output: the result was that the sound faded in evenly from the beginning of the output to the end of the output. So, although this easy fix could work for most fade-in transitions, it won't work for a fade-in starting at frame 0.
My mlt file was generated with kdenlive 18, but the same issue happens in kdenlive 20; instead of frame numbers, there are frame times with millisecond resolution. But the same nonintuitive frame coordinate system is in use, and fade-ins are still one frame late.
So it appears that I can't move forward with this analysis until I get some clarity on the meaning of the coordinate system.
Sorry to lean on you again like this...I was really hoping this would be the first bug fix I contributed.
MLT filters cannot represent a parameter value change in a single frame. The minimum is two frames. So, if you want the volume to go from 0 to 1.0 over two frames, the first frame is 0 (silent) and the second frame is 1.0. Also, the filter is simple and does not interpolate values between frames, such as 0.5 between frames 0 and 1. It does, however, know its previous value and can ramp over the samples in its frame to the value for the current frame. IOW, it is backward looking, not forward. On frame 0 all it knows is 0. On frame 1, it knows the previous value was 0, and the desired value is 1. You can look at normalize/filter_volume.c.
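To make the backward-looking behavior concrete, here is a minimal sketch in Python. It is not the code from normalize/filter_volume.c; the function name, the linear ramp shape, and the tiny samples-per-frame count are assumptions chosen only for illustration.

```python
def apply_backward_ramp(frame_gains, samples_per_frame, start_gain=0.0):
    """Return per-sample gains, one list per frame.

    Hypothetical model: each frame ramps linearly from the gain of the
    previous frame to its own gain. start_gain is what the filter
    "remembers" before frame 0 (assumed to be 0 here).
    """
    previous = start_gain
    frames = []
    for gain in frame_gains:
        step = (gain - previous) / samples_per_frame
        # The ramp ends exactly on this frame's target gain.
        frames.append([previous + step * (i + 1) for i in range(samples_per_frame)])
        previous = gain
    return frames


# Fade-in keyed as frame 0 -> 0.0 and frame 1 -> 1.0, then steady.
ramp = apply_backward_ramp([0.0, 1.0, 1.0], samples_per_frame=4)
for number, frame in enumerate(ramp):
    print(number, frame)
```

Under these assumptions, frame 0 stays silent and the whole ramp happens inside frame 1, which matches the late fade-in observed in Audacity.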
I'm sorry to hear it's a deep architectural issue. It seems to me there are plenty of filters that make sense applied over only one frame. Also, the kdenlive GUI misrepresents the effect of a one-frame fade-in filter, by drawing it as applying to one frame. (I wouldn't expect it to interpolate between frames such as 0.5; I would expect it to interpolate starting at the beginning of the current frame and ending at the beginning of the next frame.)
Oh well. If I feel brave, I can look through the filters to see how they assume the presence of at least two frames.
Also, in MLT, out is the frame number of the last frame of whatever, e.g. out = duration - 1 by default. Frame 23 is the last frame of one second at 24 fps.
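The inclusive in/out arithmetic can be sketched as follows. The helper names are hypothetical and only restate the rule above; they are not MLT API calls.

```python
def default_out(duration):
    """Last frame number of a clip that starts at frame 0."""
    return duration - 1


def duration_from_in_out(in_frame, out_frame):
    """Number of frames covered when both in and out are inclusive."""
    return out_frame - in_frame + 1


fps = 24
last_frame = default_out(fps)                  # one second at 24 fps
fade_out_frames = duration_from_in_out(22, 23)  # the fade-out from the mlt file
print(last_frame, fade_out_frames)
```

Read this way, in="22" out="23" covers two frames, the last two of the one-second clip.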
Yeah, it seems that the issue is that frame numbers in mlt refer to a frame, instead of the boundaries between frames.
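As a sketch of the difference, an inclusive frame range can be converted to the boundary times I was expecting. The function name is hypothetical, and it assumes that frame n occupies the interval from n/fps to (n+1)/fps.

```python
from fractions import Fraction


def frame_range_to_seconds(in_frame, out_frame, fps):
    """Map inclusive frame numbers to (start, end) boundary times in seconds.

    Assumes frame n spans [n / fps, (n + 1) / fps).
    """
    start = Fraction(in_frame, fps)
    end = Fraction(out_frame + 1, fps)
    return start, end


fade_out = frame_range_to_seconds(22, 23, 24)
whole_clip = frame_range_to_seconds(0, 23, 24)
print(fade_out, whole_clip)
```

With that assumption, in="22" out="23" ends exactly at the one-second mark, which is where the noise clip ends.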
This is similar to how the MS Windows graphics coordinate system works (i.e. coordinates refer to pixels) versus how the Macintosh graphics coordinate system works (i.e. coordinates refer to the one-dimensional lines that separate pixels).
This has consequences, e.g. the test for point-in-rectangle on MS Windows doesn't include the pixels on the right-border and bottom-border, even though those areas are relevant for update events, and I have to work around that when programming MS Windows GUI code.
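The convention I mean can be sketched as a half-open test. This is an illustration of the rule in Python, not the Windows API itself, and the function name is made up.

```python
def point_in_rect(x, y, left, top, right, bottom):
    """Half-open test: left and top edges are inside, right and bottom are not."""
    return left <= x < right and top <= y < bottom


on_top_left = point_in_rect(0, 0, 0, 0, 10, 10)
on_right_edge = point_in_rect(10, 5, 0, 0, 10, 10)
on_bottom_edge = point_in_rect(5, 10, 0, 0, 10, 10)
print(on_top_left, on_right_edge, on_bottom_edge)
```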