ccextractor
[BUG] Odd output when inputting an MKV file
CCExtractor version:
CCExtractor detailed version info
Version: 0.91
Git commit: Unknown
Compilation date: 2021-07-26
File SHA256: Could not open file
Libraries used by CCExtractor
Tesseract Version: 4.1.1
Leptonica Version: leptonica-1.81.1
libGPAC Version: 1.0.1
zlib: 1.2.11
utf8proc Version: 2.6.1
protobuf-c Version: 1.4.0
libpng Version: 1.6.37
FreeType
libhash
nuklear
libzvbi
Necessary information
- Is this a regression (i.e. did it work before)? No
- What platform did you use? Mac (with Homebrew)
- What were the used arguments? None
Video links
- https://drive.google.com/file/d/1B9ZeUZ-Pv1dUeu0aJP5G1s_8-_SzA_xU/view?usp=sharing (ZIP encrypted, password shared with relevant people).
Additional information
The MKV was ripped from a disc using MakeMKV. I've included the output of running ccextractor [filename] in the output.zip file below.
Output.zip
Please let me know if any more information is required.
Hi @Southpaw1496, I would like to look into the issue. Can you please send the password to the archive at : [email protected]
Here is another file of a Bugs Bunny short, unencrypted this time https://drive.google.com/file/d/1cmntXqJFZGRdoNGLljPBYqFBgeckJlO7/view?usp=sharing
Output for this file: Output-bugs.zip
Bugs file has been made available here: https://sampleplatform.ccextractor.org/sample/178
Logs from output zip above:
CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: bugs.mkv
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[CEA-708: 63 decoders active]
[CEA-708: using charset "none" for all services]
[Timing mode: Auto] [Debug: No] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]
-----------------------------------------------------------------
Opening file: bugs.mkv
File seems to be a Matroska/WebM container
Analyzing data in Matroska mode
Document type: matroska
Timecode scale: 1000000
Muxing app: libmakemkv v1.16.4 (1.3.10/1.5.2) darwin(x64-release)
Writing app: MakeMKV v1.16.4 darwin(x64-release)
Track entry:
Track number: 1
UID: 1
Type: video
Codec ID: V_MPEG2
Track entry:
Track number: 2
UID: 2
Type: audio
Codec ID: A_AC3
Language: eng
Name: Stereo
Track entry:
Track number: 3
UID: 3
Type: subtitle
Codec ID: S_VOBSUB
Language: eng
Track entry:
Track number: 4
UID: 4
Type: subtitle
Codec ID: S_VOBSUB
Language: eng
99% | 06:50
100% | 06:50
Output file: bugs_eng.(null)
Output file: bugs_eng_1.(null)
Found no AVC track.
Total frames time: 00:00:00:000 (0 frames at 29.97fps)
Done, processing time = 1 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues
It creates an empty .srt and the two files for the VOBSUB tracks (albeit with a "(null)" extension?), but there is no content in either file.
So, the issue here is that we don't support VOBSUB subtitles.
To support them, we need to create two files, .idx and .sub. We generate the .idx file (although incorrectly), but no .sub file.
Current .idx file:-
# VobSub index file, v7 (do not modify this line!)
Headers...
+ timestamp: 00:00:01:101, filepos: 000000000
+ timestamp: 00:00:08:708, filepos: 000001000
- Header is correct
- timestamp: Missing, but correct (stored in time_start)
- filepos: Missing. Need to get correct file positions according to the .sub file
We also need to write correct data to the .sub file.
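One way to get file positions that agree with the .sub file is to record the write offset just before each subtitle is written. The sketch below assumes the subtitle payloads have already been packetized into the byte layout the .sub file expects (that step is not shown); write_sub_and_collect_positions and its arguments are hypothetical names used only for illustration.

```python
import io
from typing import BinaryIO, Iterable, List, Tuple


def write_sub_and_collect_positions(
    packets: Iterable[Tuple[int, bytes]], out: BinaryIO
) -> List[Tuple[int, int]]:
    """Write subtitle data to `out` and return (start_ms, filepos) pairs.

    Assumption: each item is (start time in ms, bytes already laid out
    exactly as they should appear in the .sub file).
    """
    positions = []
    for start_ms, data in packets:
        # The offset before the write is where this subtitle begins.
        positions.append((start_ms, out.tell()))
        out.write(data)
    return positions


if __name__ == "__main__":
    buf = io.BytesIO()
    demo = [(1101, b"\x00" * 4096), (8708, b"\x00" * 2048)]
    for start_ms, pos in write_sub_and_collect_positions(demo, buf):
        print(start_ms, hex(pos))
```

Collecting the offsets while writing keeps the index and the data in step, so the .idx entries cannot drift from what is actually in the .sub file.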
References:
- https://www.matroska.org/technical/subtitles.html
- mkvextract