
Audio desync for VFR input.

Open w-barath opened this issue 4 years ago • 7 comments

Every VFR input I've fed qencoder produces output with the audio out of sync.

I remember similar issues with Xvid and x264 in the early days, when those were being used by front-ends that encoded the video separately and then remuxed it with the audio when the job was done. Wild guess that the yuv420 stream being piped to aomenc lacks the frame timestamps, or that aomenc drops them, and/or qencoder makes no attempt to restore the lost timestamps during muxing.

I'm sure I can work around this by adding custom ffmpeg flags to convert VFR to CFR by inserting duplicated frames. In my experience that can introduce judder. Adding extra frames for aomenc to digest also seems like something to avoid, lol.

So, anyone have some pointers to achieve this fix by other means? Maybe there's a commandline tool to extract the timestamp data from the original file and apply it to the output?

w-barath avatar Dec 23 '20 01:12 w-barath

So far I found this:

mkvextract original.mkv timestamps_v2 1:timestamps.txt

that gets me the timestamps but now how to merge them, lol.

Hopefully when I figure this out, this issue can become a feature request to add a flag to apply these changes automatically to mkv output files to correct for VFR audio desync.

w-barath avatar Dec 23 '20 01:12 w-barath

mkvmerge -o amended.mkv --timestamps 1:timestamps.txt output.mkv

Does the trick, assuming the timestamps are taken from the video stream and applied to the video stream.
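
For anyone who wants to script the two commands above, here is a minimal Python sketch. The function names and the split into separate source and output track IDs are assumptions made for illustration, not something qencoder provides; the mkvextract and mkvmerge arguments are the ones quoted in this thread. It assumes MKVToolNix is on the PATH.

```python
import subprocess

def extract_cmd(source, track_id, timestamps_path):
    """Build the mkvextract call that dumps one track's timestamps (v2 format)."""
    return ["mkvextract", source, "timestamps_v2", f"{track_id}:{timestamps_path}"]

def remux_cmd(encoded, track_id, timestamps_path, amended):
    """Build the mkvmerge call that applies a timestamp file to one track."""
    return ["mkvmerge", "-o", amended,
            "--timestamps", f"{track_id}:{timestamps_path}", encoded]

def run_tool(cmd):
    """Run an MKVToolNix command. Exit code 1 means warnings, 2 means an error."""
    result = subprocess.run(cmd)
    if result.returncode >= 2:
        raise RuntimeError(f"command failed: {cmd}")
    return result.returncode

def restore_timestamps(source, encoded, amended, src_track, out_track,
                       timestamps_path="timestamps.txt"):
    """Extract timestamps from the source video track, then apply them on remux."""
    run_tool(extract_cmd(source, src_track, timestamps_path))
    run_tool(remux_cmd(encoded, out_track, timestamps_path, amended))
```

The two track IDs are kept separate because the video track may not have the same ID in the source and in the encoded file. Passing command lists rather than one string also avoids hand-quoting file names.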

However I'm still getting large desync of about 1.5s at startup (audio is delayed, video first frame at 0ms) and about 5.5s at the end (eyeballing, could be significantly different).

This is out of my wheelhouse for debugging.

I generated timestamp files for both the input and the output. Here's the first few lines of each, input followed by output:

# timestamp format v2
83
125
166
208
250
292
333
# timestamp format v2
0
42
83
125
167
209
250
292
334

Perhaps more interesting is the end...

input:
1523146
1523188
1523230
[ about 5.5s of lines cut ]
1528860
1528902
1528943.708333

output:
1523147
1523188
1523229.708333

Apart from that the two files are pretty much lock step plus or minus a millisecond every second frame.
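
The comparison done by eye above can be automated. This is a rough Python sketch that parses two "timestamp format v2" files and reports the frame-count difference and the start and end offsets. The function names are hypothetical, and it assumes complete files with one timestamp in milliseconds per line (so it would not accept an excerpt with a "lines cut" marker in it).

```python
def parse_timestamps_v2(text):
    """Return the timestamps (ms) from a 'timestamp format v2' file as floats."""
    stamps = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the format header
        stamps.append(float(line))
    return stamps

def compare(input_stamps, output_stamps):
    """Summarise how the output timestamps differ from the input timestamps."""
    return {
        "frame_diff": len(input_stamps) - len(output_stamps),
        "start_offset_ms": input_stamps[0] - output_stamps[0],
        "end_offset_ms": input_stamps[-1] - output_stamps[-1],
    }

if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as f_in, open(sys.argv[2]) as f_out:
        print(compare(parse_timestamps_v2(f_in.read()),
                      parse_timestamps_v2(f_out.read())))
```

A non-zero frame_diff is the first thing to look for, since re-applying timestamps cannot fix a stream that has a different number of frames.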

There were no complaints on the log while encoding this file.

I guess I should find a public content VFR file to reproduce the issue with?

w-barath avatar Dec 23 '20 02:12 w-barath

I don't know if I can help, but I will give my two cents on how I did it with neav1e:

  1. Piping has to be set to drop the timestamps from the demuxer VSYNC = "-vsync drop";
  2. Extract Timestamps mkvextract " + '\u0022' + VideoInput + '\u0022' + " timestamps_v2 0:" + '\u0022' + Path.Combine(TempPath, TempPathFileName, "vsync.txt") + '\u0022'
  3. Set mkvmerge muxing command: VFRCMD = "--timestamps 0:" + '\u0022' + Path.Combine(TempPath, TempPathFileName, "vsync.txt") + '\u0022';
  4. After encode mux it with mkvmerge: mkvmerge --output " + '\u0022' + VideoOutput + '\u0022' + " " + VFRCMD + " --language 0:und --default-track 0:yes " + '\u0022' + Path.Combine(MainWindow.TempPath, "temp.mkv") + '\u0022'

Side note: '\u0022' = quotation mark

Hope this helps...

Alkl58 avatar Dec 23 '20 11:12 Alkl58

So far I found this:

mkvextract original.mkv timestamps_v2 1:timestamps.txt

Btw. that looks like you are extracting the timestamps of the audio and not the video... Generally "0:" is the index of the video.

Alkl58 avatar Dec 23 '20 11:12 Alkl58

@Alkl58 thanks, that point was worth considering. However in this instance the video was stream 1, audio was stream 0. I already verified that by using MKVToolNixGUI and ffprobe before I extracted the tracks. Then to double-check, I subtracted the frame timings from each other and got 24fps, which is correct. The audio track is more like 51fps with 19-20ms per frame. You can see the timings I quoted above if you'd like to check for yourself.
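
The frame-rate check described above can be written as a short Python sketch. The function name is made up for illustration; it averages over the whole span instead of looking at single frame gaps, because the millisecond rounding makes individual gaps alternate between 41 and 42.

```python
def average_fps(stamps_ms):
    """Estimate frames per second from a list of timestamps in milliseconds."""
    if len(stamps_ms) < 2:
        raise ValueError("need at least two timestamps")
    span_ms = stamps_ms[-1] - stamps_ms[0]
    if span_ms <= 0:
        raise ValueError("timestamps must be increasing")
    # (n - 1) frame intervals fit into the span between first and last frame
    return 1000.0 * (len(stamps_ms) - 1) / span_ms

# First lines of the input timestamp file quoted earlier in this thread
video_head = [83, 125, 166, 208, 250, 292, 333]
print(average_fps(video_head))  # 24.0
```

For a genuinely VFR file this is only the average rate, but it is enough to tell a video track from an audio track.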

To anyone else having the same problem as me, regardless of the software you are using (i.e. not qencoder), you can easily use MKVToolNixGUI to visualise the stream ID numbers. There's also MediaInfo and, of course, ffprobe, which everyone using qencoder will already have at their disposal.
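
For a scripted version of the ffprobe check, here is a minimal Python sketch. The function names are hypothetical and it assumes ffprobe is on the PATH. One more assumption to verify on your own files: ffprobe reports stream indices, which for Matroska files usually line up with the track IDs that mkvextract and mkvmerge use, but `mkvmerge --identify` is the authoritative source for those.

```python
import json
import subprocess

def probe_streams(path):
    """Ask ffprobe for each stream's index and type; returns JSON text."""
    result = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "stream=index,codec_type",
         "-of", "json", path],
        capture_output=True, text=True, check=True)
    return result.stdout

def streams_by_type(ffprobe_json):
    """Map each codec_type ('video', 'audio', ...) to its stream indices."""
    by_type = {}
    for stream in json.loads(ffprobe_json).get("streams", []):
        by_type.setdefault(stream["codec_type"], []).append(stream["index"])
    return by_type
```

Calling `streams_by_type(probe_streams("original.mkv"))` on a file laid out like the one in this issue would list the audio under index 0 and the video under index 1.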

I wish GitHub allowed screenshots so I could post just how clearly this info is presented via those tools. Also, MKVToolNixGUI lets you click on a video stream and choose a frame timing file to use with that stream when re-muxing, which makes it easier for those who are less comfortable with the commandline.

The source you quoted actually generates the same commandline that I was using, except that it assumes the wrong stream ID and would have tried to merge the video frame timings with the audio track.

The real problem seems to be that qencoder lost 137 frames from the source video somehow, also shown in the list of timings above. sigh.

Maybe I should file a feature request that qencoder add a sanity check at the end to double-check that the final muxing output has the same number of frames as the input .ivf files? There's lovely accounting in 3 places for the number of frames per chunk, but the program happily concludes that its work was done correctly even when the output has zero frames.
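
To make the proposed sanity check concrete, here is a rough Python sketch of how it could work. This is not qencoder's code and the function names are hypothetical. It walks the frame records of each .ivf chunk (a 32-byte file header, then a 12-byte header in front of every frame) and compares the total with an expected frame count. It assumes each IVF frame record corresponds to one displayed frame.

```python
import struct

IVF_FILE_HEADER_SIZE = 32   # 'DKIF' signature, version, header size, codec info
IVF_FRAME_HEADER_SIZE = 12  # 4-byte frame size + 8-byte presentation timestamp

def count_ivf_frames(data):
    """Count frame records in the raw bytes of an IVF file."""
    if len(data) < IVF_FILE_HEADER_SIZE or data[:4] != b"DKIF":
        raise ValueError("not an IVF stream")
    # The header records its own length as a little-endian uint16 at offset 6
    pos = struct.unpack_from("<H", data, 6)[0]
    frames = 0
    while pos + IVF_FRAME_HEADER_SIZE <= len(data):
        (frame_size,) = struct.unpack_from("<I", data, pos)
        pos += IVF_FRAME_HEADER_SIZE + frame_size
        if pos > len(data):
            raise ValueError("truncated frame at end of file")
        frames += 1
    return frames

def check_total(chunk_counts, expected_frames):
    """Raise if the chunks do not add up to the expected number of frames."""
    total = sum(chunk_counts)
    if total != expected_frames:
        raise RuntimeError(f"expected {expected_frames} frames, found {total}")
    return total
```

Walking the records instead of trusting the frame-count field in the file header is a deliberate choice here, since that field is only correct if the encoder went back and updated it when it finished.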

w-barath avatar Dec 24 '20 05:12 w-barath

I don't want to seem overly critical of qencoder. One of its best features comes from its over-abundance of source splitting. When you're done encoding a file and you preview it, if you notice a scene where it needs a few extra bits, you can delete the scene from the "encoded" folder and remove the reference to it from done.json, then re-encode the file. It will speedily find the missing scene, re-encode just that one scene, re-mux the file, and you can view your results in very short order. That's a very handy feature if you're pushing the bounds of acceptable quality and want to be able to correct small flaws without re-encoding the whole file, and without the trouble of using a GOP-chop tool to slice & dice the pieces manually.

This could be a pro-level tool if the papercuts were resolved, namely: having to enter the identical encoding settings each time to restart encoding (because qencoder restarts fresh each time it finishes, presumably to avoid memory leaks accumulating); not having a way to preview the encoded .ivf chunks while encoding is under way (MKVToolNixGUI is helpful for that); and not having a timeline to relate a flaw to its scene chunk number, which increases the pain of manually removing the scene .ivf chunk and its "done" record. If those things were seamlessly part of the UI, as part of an output preview pane, this would seriously rock and @natis1 would have a very marketable product.

w-barath avatar Dec 24 '20 06:12 w-barath

qencoder should not restart fresh if you restart it, assuming you close it gracefully. It might not save if you kill -9 it, but you can kill it and interrupt encoding through other signals.

VFR will have to be implemented by the backend but it is definitely something I intend on implementing soon. I am working on more pro level tools like multiple presets per input and among them is VFR. I do ultimately want it to be usable by anyone who needs to accomplish transcoding.

As is though it's a totally free tool, not pro, and as I currently have no real way to monetize it, it being used by 10 people or by everyone at every major studio doesn't matter much.

natis1 avatar Dec 24 '20 06:12 natis1