
[BUG] Odd output when inputting an MKV file

Open Southpaw1496 opened this issue 3 years ago • 4 comments

CCExtractor version:

CCExtractor detailed version info
	Version: 0.91
	Git commit: Unknown
	Compilation date: 2021-07-26
	File SHA256: Could not open file
Libraries used by CCExtractor
	Tesseract Version: 4.1.1
	Leptonica Version: leptonica-1.81.1
	libGPAC Version: 1.0.1
	zlib: 1.2.11
	utf8proc Version: 2.6.1
	protobuf-c Version: 1.4.0
	libpng Version: 1.6.37
	FreeType 
	libhash
	nuklear
	libzvbi

Necessary information

  • Is this a regression (i.e. did it work before)? No
  • What platform did you use? Mac (with Homebrew)
  • What were the used arguments? None

Video links

  • https://drive.google.com/file/d/1B9ZeUZ-Pv1dUeu0aJP5G1s_8-_SzA_xU/view?usp=sharing (ZIP encrypted, password shared with relevant people).

Additional information

The MKV was ripped from a disc using MakeMKV. I've included the output of running ccextractor [filename] in the attached Output.zip file. Please let me know if any more information is required.

Southpaw1496, Aug 06 '21 11:08

Hi @Southpaw1496, I would like to look into the issue. Could you please send the password for the archive to: [email protected]

sheharyaar, Aug 25 '21 23:08

Here is another file, a Bugs Bunny short, unencrypted this time: https://drive.google.com/file/d/1cmntXqJFZGRdoNGLljPBYqFBgeckJlO7/view?usp=sharing

Output for this file: Output-bugs.zip

Southpaw1496, Jan 03 '22 10:01

Bugs file has been made available here: https://sampleplatform.ccextractor.org/sample/178

Logs from output zip above:

CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: bugs.mkv
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[CEA-708: 63 decoders active]
[CEA-708: using charset "none" for all services]
[Timing mode: Auto] [Debug: No] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]

-----------------------------------------------------------------
Opening file: bugs.mkv
File seems to be a Matroska/WebM container
Analyzing data in Matroska mode


Document type: matroska
Timecode scale: 1000000
Muxing app: libmakemkv v1.16.4 (1.3.10/1.5.2) darwin(x64-release)
Writing app: MakeMKV v1.16.4 darwin(x64-release)

Track entry:
    Track number: 1
    UID: 1
    Type: video
    Codec ID: V_MPEG2

Track entry:
    Track number: 2
    UID: 2
    Type: audio
    Codec ID: A_AC3
    Language: eng
    Name: Stereo

Track entry:
    Track number: 3
    UID: 3
    Type: subtitle
    Codec ID: S_VOBSUB
    Language: eng

Track entry:
    Track number: 4
    UID: 4
    Type: subtitle
    Codec ID: S_VOBSUB
    Language: eng
 99%  |  06:50
100%  |  06:50
Output file: bugs_eng.(null)
Output file: bugs_eng_1.(null)

Found no AVC track. 

Total frames time:	  00:00:00:000  (0 frames at 29.97fps)
Done, processing time = 1 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues

It creates an empty .srt and two files for the VOBSUB tracks (albeit with a "(null)" extension?), but neither of those files has any content.

canihavesomecoffee, Jan 03 '22 12:01

The issue here is that we don't support VOBSUB subtitles. Supporting them requires writing two files per track, an .idx and a .sub. We currently generate the .idx file (with incorrect contents) but no .sub file.

Current .idx file:

# VobSub index file, v7 (do not modify this line!)
Headers...

+ timestamp: 00:00:01:101, filepos: 000000000
+ timestamp: 00:00:08:708, filepos: 000001000
  • Header: correct
  • timestamp: missing, but the correct value is available (stored in time_start)
  • filepos: missing; the correct file positions have to be taken from the .sub file

We also need to write correct data to the .sub file.
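
To make the missing pieces concrete, here is a minimal Python sketch of how the index entries could be produced. It is not CCExtractor code (the project itself is written in C), and the function names are hypothetical. It assumes that each subtitle's start time is known in milliseconds (as with time_start), that filepos is the byte offset at which that subtitle's data begins in the .sub file, and that the offset is written as nine hexadecimal digits, which matches the sample entries above.

```python
def format_idx_timestamp(ms: int) -> str:
    """Format milliseconds as HH:MM:SS:mmm, the layout used in the .idx sample."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{millis:03d}"


def format_idx_entry(start_ms: int, filepos: int) -> str:
    """Build one index entry; filepos is assumed to be a hex byte offset."""
    return f"timestamp: {format_idx_timestamp(start_ms)}, filepos: {filepos:09x}"


def build_idx_lines(packets):
    """packets: iterable of (start_ms, bytes_written_to_sub) in file order.

    Each entry points at the offset where its data starts, so the offset
    is recorded before the packet's size is added to the running total.
    """
    lines = []
    offset = 0
    for start_ms, size in packets:
        lines.append(format_idx_entry(start_ms, offset))
        offset += size
    return lines


if __name__ == "__main__":
    # Two subtitles; the first occupies 0x1000 bytes in the .sub file.
    for line in build_idx_lines([(1101, 0x1000), (8708, 0x800)]):
        print(line)
```

The point of the sketch is the ordering constraint: the .idx entries can only be written once the size of each block written to the .sub file is known, so both files have to be produced together.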

References:

  • https://www.matroska.org/technical/subtitles.html
  • mkvextract

PunitLodha, Jul 04 '22 07:07