Fix for MP4 audio stream not playing in some video players
Description
This PR addresses an issue with how the final MP4 container is formed with ffmpeg during rolling records. The GStreamer pipeline encodes audio to MP3 using the lamemp3enc encoder, but a lot of native MP4 video players don't seem to like an MP3 audio stream and end up not playing the audio (video players like VLC play it with no problem). In previous commits, aac was removed to fix the audio and video streams being out of sync.
This fix does the following:
- Re-enable faac in the GStreamer pipeline and bump its bitrate to 320kbps.
- Change the ffmpeg process to:
  - Copy the video stream as is, so the video quality is not degraded.
  - Re-encode the audio stream from aac to aac, but at 192kbps, and perform asynchronous resampling at 1,000 samples per second.
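The ffmpeg step described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the function name `build_remux_command` and the file paths are hypothetical, and it assumes the resampling is done with ffmpeg's `aresample` filter and its `async` option, which the description implies but does not spell out.

```python
from typing import List


def build_remux_command(src: str, dst: str,
                        audio_bitrate: str = "192k",
                        max_comp: int = 1000) -> List[str]:
    """Build an ffmpeg argv that copies video and re-encodes audio to AAC.

    Hypothetical helper for illustration only.
    """
    return [
        "ffmpeg",
        "-i", src,
        "-c:v", "copy",          # copy the video stream as is (no quality loss)
        "-c:a", "aac",           # re-encode audio with ffmpeg's native AAC encoder
        "-b:a", audio_bitrate,   # target audio bitrate
        # Assumption: async resampling via the aresample filter, allowing up to
        # max_comp samples per second of stretch/squeeze to match timestamps.
        "-af", f"aresample=async={max_comp}",
        dst,
    ]


if __name__ == "__main__":
    # Placeholder paths; print the command instead of running it.
    print(" ".join(build_remux_command("rolling.mp4", "final.mp4")))
```

Building the command as a list (rather than one shell string) avoids quoting problems if it is later passed to `subprocess.run`.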
This is a sample recording using these changes.
Related issues
- #17
- #18
- Loosely related and, potentially, the culprit for the audio stream desync when using aac.
Potential problems
- Re-enabling faac in the GStreamer audio pipeline might introduce audio desync in recordings that aren't from the rolling/replay feature. I'll need to test this to see if this happens. If it does, we can go back to mp3 in the pipeline, but still do the encoding to aac with ffmpeg.
- There might be a performance cost to me bumping the bitrate from 128kbps to 320kbps in the GStreamer pipeline. I haven't noticed any major performance cost, but it's something to keep an eye on.
  - I chose a higher bitrate to kinda help with the lossy-to-lossy encoding we do later when saving the final video with ffmpeg. It's not ideal doing lossy-to-lossy, since it can lead to audio quality degradation, but it's the best I can come up with at the moment.
- The asynchronous resampling of the audio stream can kinda seem distorted in the final output, since it's essentially stretching or squeezing the audio stream to fit the video stream. You can sorta hear it in the sample video from above. The distortion I'm hearing in that sample is the background music in the game sounding like it's been "stretched", if that makes any sense.
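To put a rough number on the "stretched" sound, the arithmetic below estimates the worst-case speed change that a budget of 1,000 samples per second allows. It assumes a 48 kHz audio stream, which is not stated in this PR, and it assumes the resampler uses the whole budget, which it may not. The function name `max_stretch` is made up for the example.

```python
import math
from typing import Tuple


def max_stretch(sample_rate_hz: int, max_comp: int = 1000) -> Tuple[float, float]:
    """Return (fractional speed change, equivalent pitch shift in semitones).

    max_comp is the maximum stretch/squeeze in samples per second.
    """
    ratio = max_comp / sample_rate_hz
    # A speed change of (1 + ratio) shifts pitch by 12 * log2(1 + ratio) semitones.
    semitones = 12 * math.log2(1 + ratio)
    return ratio, semitones


if __name__ == "__main__":
    ratio, semitones = max_stretch(48_000)  # 48 kHz is an assumption
    print(f"max speed change: {ratio:.2%}, about {semitones:.2f} semitones")
```

Under those assumptions the budget works out to roughly 2% of the sample rate, which is enough to be audible on sustained tones like background music while being much harder to notice on speech.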
Other notes
I'm still trying to figure out a better solution for this. I think if we can figure out how to get the VAAPI H264 encoder to produce the proper timescale/timestamps, we won't have the audio desync issue and won't have to do any audio resampling with ffmpeg (or re-encode the audio stream either).
- Re-enabling faac in the GStreamer audio pipeline might introduce audio desync in recordings that aren't from the rolling/replay feature. I'll need to test this to see if this happens. If it does, we can go back to mp3 in the pipeline, but still do the encoding to aac with ffmpeg.
Seems like re-enabling faac didn't break non-rolling/replay recordings, so manual recordings are not being broken.
On that same note, you can tell the difference in audio between the rolling/replay recordings and the manual recordings. The rolling/replay recordings definitely have distortion in the audio stream with the resampling.
Edit:
Wanted to add another sample for the rolling/replay recordings. I wanted to see if the audio seemed distorted in a dialogue-heavy scene, but I can't really hear it in this instance.
So sorry for not replying to this earlier. This looks great. I will try to review and test soon.
@safijari when will you merge it to the main branch?
I was also facing desynced audio issues on Decky Recorder videos.
So I managed to improve the FFmpeg script that I use to edit my recordings, and the approach I used was to change how I merge the videos together. I used to use the concat demuxer approach, but I changed it to the concat protocol approach. There's no desynced audio anymore, and I didn't need to re-encode the audio from MP3 to AAC.
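For reference, the two merge approaches mentioned above look roughly like this. This is only a sketch with hypothetical helper names and placeholder file names, not the script referred to in this comment. One assumption to flag: the concat protocol joins files at the byte level, so it only works for formats that tolerate that (MPEG-TS, for example), and whether the recorder's segments qualify is not confirmed here.

```python
from typing import List


def concat_demuxer_list(segments: List[str]) -> str:
    """Contents of the list file read by the concat demuxer.

    Assumes paths contain no single quotes (those would need escaping).
    """
    return "".join(f"file '{path}'\n" for path in segments)


def concat_demuxer_command(list_file: str, dst: str) -> List[str]:
    """Old approach: concat demuxer, reading segment paths from a list file."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dst]


def concat_protocol_command(segments: List[str], dst: str) -> List[str]:
    """New approach: concat protocol, joining segments with '|' in the input URL."""
    if not segments:
        raise ValueError("at least one segment is required")
    return ["ffmpeg", "-i", "concat:" + "|".join(segments),
            "-c", "copy", dst]


if __name__ == "__main__":
    parts = ["part1.ts", "part2.ts"]  # placeholder segment names
    print(concat_demuxer_list(parts), end="")
    print(" ".join(concat_protocol_command(parts, "merged.mp4")))
```

Both commands use `-c copy`, so neither re-encodes the audio or video streams.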
I also noticed it would be possible to change the GStreamer pipeline from MP3 to AAC:
cmd = (
cmd
+ f' pulsesrc device="Recording_{monitor}" ! audio/x-raw, channels=2 ! audioconvert ! avenc_aac target=bitrate bitrate={self._audioBitrate} cbr=true ! sink.audio_0'
)
I changed lamemp3enc to avenc_aac, so it is no longer necessary to re-encode using FFmpeg, which preserves the audio quality. I'm not sure if this would solve the issue, though. @antib0t, have you tried this approach at any point?