AudioAlign

[REQ] MPEG transport stream support

MarcoRavich opened this issue 1 year ago · 6 comments

Hi there, we've just tried to sync some files from our - Canon - cams and discovered that MTS files aren't supported yet (AA goes into a deadlocked state when such a file is drag-and-dropped onto it).

It would be much better to avoid remuxing (which can introduce problems, as well described here), since it forces AVCHD users to perform an unnecessary extra step.

Thanks in advance.

EDIT: of course @justdan96's tsMuxer may help...

MarcoRavich commented Jan 21 '24

What is your use case and expected outcome? Could you provide a sample file?

protyposis commented Jan 21 '24

Hi there, well the "use case" is multicam shooting of a live event.

Here are a couple of files from our - Canon - cameras to test the AA functionality: https://mega.nz/folder/NwFRyYzJ#MKWWK2E8Vp8H_RjxBknFXg

Thanks in advance.

MarcoRavich commented Jan 22 '24

Assuming you synchronized your video files in Audio Align, what's your expected output? Are you aware that AA does not export video files?

protyposis commented Jan 22 '24

Assuming you synchronized your video files in Audio Align, what's your expected output?

As said, our installation of AA goes into a deadlocked state when an MTS file is dropped on it, so syncing is not possible.

Are you aware that AA does not export video files?

We've successfully synced MP4 (a/v) files and exported to Vegas EDL: it works correctly.

Hope that helps.

MarcoRavich commented Jan 22 '24

Thanks, so your use case is EDL export, in which case it makes sense to work directly with video files. I'll have to investigate whether MTS support can be added, but I can't give an estimate of if and when this is going to happen.

A workaround is to extract the audio from the video files (preferably to .wav), synchronize the audio files, and then manually replace the audio files with the video files in the exported EDL file.
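
For example, something along these lines (just a sketch; it assumes ffmpeg is available on the PATH, and the file names are placeholders) extracts a .wav track from each clip:

```python
# Sketch of the workaround: extract the audio of each video file as .wav so it
# can be loaded and synchronized in AudioAlign. Assumes ffmpeg is on PATH;
# the file names and the 48 kHz rate are only illustrative.
import subprocess
from pathlib import Path

def extract_wav(video_path: str) -> Path:
    wav_path = Path(video_path).with_suffix(".wav")
    subprocess.run(
        ["ffmpeg", "-y", "-i", video_path,
         "-vn",                # drop the video stream
         "-c:a", "pcm_s16le",  # uncompressed 16-bit PCM
         "-ar", "48000",       # keep a common sample rate
         str(wav_path)],
        check=True,
    )
    return wav_path

for clip in ["cam1.MTS", "cam2.MTS"]:  # hypothetical input files
    extract_wav(clip)
```

After synchronizing the .wav files and exporting the EDL, the .wav names can be swapped back to the corresponding video file names.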

protyposis commented Jan 23 '24

Thanks, so your use case is EDL export, in which case it makes sense to work directly with video files. I'll have to investigate whether MTS support can be added, but I can't give an estimate if and when this is going to happen.

It would be great to support it, because it's still used by many cameras.

A workaround is extracting the audio from video files (preferably in .wav format), synchronize the audio files, and manually replace the audio files with the video in the exported EDL file.

That's what I've done, of course.

Note: does AA perform better syncs with uncompressed audio (.wav)?

MarcoRavich commented Jan 23 '24

I dug around GitHub a bit; maybe some of these resources could help:

  • TSDuck: the MPEG Transport Stream Toolkit
  • mts: a library for parsing MPEG-TS files and streams
  • tslib: a lightweight TS packaging library
  • go-astits: a Golang library to natively demux and mux MPEG Transport Streams (TS) in Go
  • demux-mpegts: basic parsers for common streams

...anyway, note that FFmpeg does support MTS decoding (such files are, in many cases, AVC video + AC-3 audio streams).
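
For instance (just an illustrative check, assuming ffprobe from an FFmpeg install is on the PATH; the file name is a placeholder), the stream layout of an .MTS file can be inspected like this:

```python
# Illustrative only: list the codec name and type of each stream in an .MTS file
# via ffprobe (part of FFmpeg). Typical AVCHD files show an h264 video stream
# and an ac3 audio stream. The file name is a placeholder.
import subprocess

result = subprocess.run(
    ["ffprobe", "-v", "error",
     "-show_entries", "stream=codec_name,codec_type",
     "-of", "csv=p=0",
     "00001.MTS"],  # hypothetical input file
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```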

MarcoRavich commented Jan 26 '24

Basic TS support has been added in https://github.com/protyposis/AudioAlign/releases/tag/v1.6.0. I validated it with the two sample files you posted above - thanks for providing them. Please let me know if it works for you.

does AA perform better syncs using uncompressed audio (.wav) ?

No, the synchronization results are the same.

Playback works better with uncompressed audio because the FFmpeg access layer for compressed media isn't very stable yet. I recommend always using .wav files for a stable experience, even though many compressed files work without issues.

protyposis commented Jan 27 '24

Cool, thanks.

Testing.

Note: can you recommend a "best" algorithm for syncing live concert footage shot from different cameras?

MarcoRavich commented Jan 27 '24

I usually use the "HK02" algorithm. In general, fingerprinting algorithms are designed to be used with recordings that all contain the same common audio source. This is the case, e.g., in a proscenium stage setup where all cameras record toward a stage with amplified PA audio.

Your case here is special because there is no main audio mix and the cameras were recording from different places and captured quite different audio. Even if they recorded the same instruments, there are differences that fingerprinting methods aren't designed for; e.g., their timings are slightly different (due to the different distances and resulting sound delays). You'll have to experiment, but given the short video durations, the single found match of HK02 could be sufficient for synchronization.

protyposis commented Jan 29 '24

Your case here is special because there is no main audio mix

Well, of course we have the "mixer audio" recording too... In this regard: how are the different audio formats (bit depths, sample rates, and codecs) handled? For example, if lossy audio tracks are decoded through FFmpeg, has avoiding loudness scaling (the -drc_scale 0 parameter) been considered too? Does it resample all tracks to the highest rate (e.g. 96 kHz) before fingerprinting?

Last but not least, we've noticed some overlap between tracks in the alignment results, but we'll detail this (with screenshots) in a separate issue soon.

MarcoRavich commented Jan 30 '24

Well, of course we have the "mixer audio" recording too...

I wasn't talking about a separate recording. Ideally, every device records this signal from a common source, e.g. PA speakers - that's when fingerprint sync works best.

how are the different audio formats (bit depths, sample rates, and codecs) handled?

Fingerprinting works on low resolution signals (5–8 kHz), so they are all downsampled. Loudness is irrelevant.
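
As a rough sketch of that preprocessing (not the actual AudioAlign code; the soundfile/scipy packages and the 5512 Hz target are only illustrative choices within that range):

```python
# Sketch of the kind of preprocessing described above: mix down to mono and
# downsample to a low fingerprinting rate. Not AudioAlign's actual pipeline;
# soundfile/scipy and the 5512 Hz target are assumptions for illustration.
import soundfile as sf
from scipy.signal import resample_poly

def prepare_for_fingerprinting(path: str, target_rate: int = 5512):
    samples, rate = sf.read(path)        # decode audio at its native rate
    if samples.ndim > 1:
        samples = samples.mean(axis=1)   # mix multi-channel audio to mono
    # Polyphase resampling from the native rate down to the fingerprinting rate;
    # absolute loudness does not matter for the fingerprint itself.
    return resample_poly(samples, target_rate, rate), target_rate
```

The exact target rate and filtering differ per algorithm, but the point is that every track is reduced to a common low-rate signal before fingerprinting.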

protyposis commented Jan 30 '24

Closing because MTS support has been added.

protyposis commented Feb 02 '24