Recovery of files where there is a stream with an unknown codec

Open kcperry opened this issue 7 years ago • 6 comments

I'm hoping there is a solution for this. I have an mp4 file that was produced by a Cadillac's PDR (performance data recorder). The file has a data stream in addition to the audio and video streams. Here is the ffprobe output from a good file.

~~~
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '180526_132240_00006_.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    creation_time   : 2018-05-25T13:22:40.000000Z
  Duration: 00:25:05.18, start: 0.000000, bitrate: 5690 kb/s
    Stream #0:0(eng): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p, 1280x720, 5334 kb/s, 30.52 fps, 30.53 tbr, 100k tbn, 200k tbc (default)
    Metadata:
      creation_time   : 2018-05-25T13:22:42.000000Z
      handler_name    : VideoHandler
      encoder         : H.264
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 320 kb/s (default)
    Metadata:
      creation_time   : 2018-05-25T13:22:42.000000Z
      handler_name    : SoundHandler
    Stream #0:2(eng): Data: none (marl / 0x6C72616D), 31 kb/s (default)
    Metadata:
      creation_time   : 2018-05-25T13:23:22.000000Z
      handler_name    : Marlin PDR 1.0
Unsupported codec with id 0 for input stream 2
~~~

Obviously with the data stream not being a supported or known codec, I don't expect that to be recovered. All I need is the audio and video.

When I run untrunc with the good file and my corrupted file, I get this

~~~
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '180526_132240_00006_.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    creation_time   : 2018-05-25T13:22:40.000000Z
  Duration: 00:25:05.18, start: 0.000000, bitrate: 5690 kb/s
    Stream #0:0(eng): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p, 1280x720, 5334 kb/s, 30.52 fps, 30.53 tbr, 100k tbn, 200k tbc (default)
    Metadata:
      creation_time   : 2018-05-25T13:22:42.000000Z
      handler_name    : VideoHandler
      encoder         : H.264
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 320 kb/s (default)
    Metadata:
      creation_time   : 2018-05-25T13:22:42.000000Z
      handler_name    : SoundHandler
    Stream #0:2(eng): Data: none (marl / 0x6C72616D), 31 kb/s (default)
    Metadata:
      creation_time   : 2018-05-25T13:23:22.000000Z
      handler_name    : Marlin PDR 1.0
Mismatch between time offsets and size offsets:
Time offsets: 64819 Size offsets: 64820
Mismatch between time offsets and sample_to_chunk offsets:
Time offsets: 64819 Chunk offsets: 64820
Track codec: mp4a
Success for no particular reason....
[aac @ 0x5617d30662a0] Sample rate index in program config element does not match the sample rate index configured by the container.
[aac @ 0x5617d30662a0] Inconsistent channel configuration.
[aac @ 0x5617d30662a0] get_buffer() failed
Duration: 0

Invalid length. -22. Wrong match in track: 0
Track codec: avc1
avc1: failed for not particular reason
Track codec:
Found 0 packets
Track duration: 0 movie timescale: 100000 track timescale: 440996
Track duration: 0 movie timescale: 100000 track timescale: 100000
Track duration: 0 movie timescale: 100000 track timescale: 0
~~~

Two questions...

  1. Is it possible to recover a stream of an unknown codec?
  2. Is it possible to recover just the known streams in a file when there is a good reference file? If this is possible now, what steps would be needed?

Also, these files are racing track laps, so they are definitely going to be similar.

kcperry avatar May 28 '18 07:05 kcperry

Hi, untrunc needs to somehow know the length of each sample. Different codecs require different ways of finding that length. In untrunc, the matchSample method tries to identify the codec. If that works, getLength will try to get the length. Because the different streams are all clumped together in the mdat atom, there is no way to simply ignore the unknown one. So what you need to do is figure out the bit-stream syntax of your data samples, so that you can tell their length, either by finding the manuals for your codec or by using a hex editor.
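To illustrate why an unknown stream cannot simply be skipped, here is a minimal sketch of walking interleaved samples. It is not untrunc's actual code: `LengthProbe` and `walkMdat` are hypothetical names, and each probe stands in for whatever per-codec logic can recognise a sample and report its length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical per-codec probe (an assumption, not untrunc's API): returns the
// sample length in bytes if the bytes at `p` look like a sample of this codec,
// or 0 if they do not.
using LengthProbe =
    std::function<std::size_t(const std::uint8_t* p, std::size_t avail)>;

// Walks an mdat payload sample by sample and returns how many bytes could be
// attributed to a known codec before the walk got stuck.
std::size_t walkMdat(const std::vector<std::uint8_t>& mdat,
                     const std::vector<LengthProbe>& probes) {
    std::size_t pos = 0;
    while (pos < mdat.size()) {
        std::size_t len = 0;
        for (const auto& probe : probes) {
            len = probe(mdat.data() + pos, mdat.size() - pos);
            if (len != 0) break;  // first codec that recognises the sample wins
        }
        // No probe matched, or the claimed length runs past the buffer: the
        // start of the next sample is unknown, so the walk has to stop here.
        if (len == 0 || len > mdat.size() - pos) break;
        pos += len;
    }
    assert(pos <= mdat.size());
    return pos;
}
```

With probes only for the audio and video codecs, the walk stops at the first data sample, and everything after it is unreachable even though it contains perfectly good audio and video.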

anthwlock avatar Jun 09 '18 10:06 anthwlock

So, to confirm: untrunc needs to understand everything in the atom in order to process it, since the muxing of streams doesn't allow for the selective removal of known vs. unknown bit-stream syntaxes.

Did I restate that correctly?

If so, then assuming I could determine the bit-stream syntax, how would I feed that into untrunc? Can such a thing be done or would the software have to be patched?

kcperry avatar Aug 06 '18 16:08 kcperry

Yes, that is correct. Once you have figured out the bit-stream syntax, you have to tell untrunc:

  1. how to detect a packet of your codec, given a uchar pointer
  2. how long this packet is.
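As a rough sketch of what those two pieces could look like: the packet layout below is invented purely for illustration (it is not the marl format, which is unknown here), and `isToyPacket` and `toyPacketLength` are hypothetical names rather than untrunc's real function signatures.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical packet layout, invented for illustration only:
//   bytes 0-1: sync word 0x4D 0x52
//   bytes 2-3: payload length, big-endian
//   bytes 4..: payload
constexpr std::size_t kToyHeaderSize = 4;

// Step 1: detect a packet of the codec, given a pointer to unsigned bytes and
// the number of bytes available from that pointer.
bool isToyPacket(const unsigned char* p, std::size_t avail) {
    assert(p != nullptr);
    return avail >= kToyHeaderSize && p[0] == 0x4D && p[1] == 0x52;
}

// Step 2: report how long the packet is (header plus payload), or 0 if the
// bytes are not a complete packet of this codec.
std::size_t toyPacketLength(const unsigned char* p, std::size_t avail) {
    if (!isToyPacket(p, avail)) return 0;
    const std::size_t payload = (static_cast<std::size_t>(p[2]) << 8) | p[3];
    const std::size_t total = kToyHeaderSize + payload;
    return total <= avail ? total : 0;
}
```

A real codec may have no sync word or explicit length field at all, in which case detection and length have to be inferred from other properties of the bit-stream.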

Take a look at this commit https://github.com/anthwlock/untrunc/commit/d77c09f720c60ac297f6915137a63362dd0cfda3 where I added support for the sawb codec. As you can see, it can be quite easy if you are lucky.

anthwlock avatar Aug 11 '18 19:08 anthwlock

Thanks for that... I would say, though, that it's only easy if you understand this stuff at a deeper level, which I do not, and unfortunately I don't have the time to take a deeper dive or get [back] into C++ programming. At least now I know what to do if that stance changes.

kcperry avatar Aug 11 '18 21:08 kcperry

I say it can be easy because, in the case of sawb, only three lines of code were added or changed. Feel free to ask me any questions whenever you don't understand something related to untrunc.

anthwlock avatar Aug 12 '18 18:08 anthwlock

Thanks again Anthon.

kcperry avatar Aug 12 '18 18:08 kcperry