Custom charset support for subtitles

Open tp7 opened this issue 11 years ago • 3 comments

Demux/open/write subtitles with user-specified charset.

Something like this should work for FFmpeg (cp1252 example):

args.extend(['-scodec', 'copy', '-sub_charenc', 'cp1252'])

And then use cp1252 while opening the file instead of utf-8-sig.
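The file-opening half of this could be sketched roughly as follows. This is only an illustration, not Sushi's actual code: `read_subtitles`, `write_subtitles`, and the `charset` parameter are hypothetical names, and the only thing taken from the comment above is the `utf-8-sig` default.

```python
import io

# Assumption: mirrors the utf-8-sig default mentioned above.
DEFAULT_CHARSET = "utf-8-sig"


def read_subtitles(path, charset=None):
    """Return the file's text, decoded with the user-specified charset if given."""
    with io.open(path, encoding=charset or DEFAULT_CHARSET) as f:
        return f.read()


def write_subtitles(path, text, charset=None):
    """Write text back using the same charset convention as read_subtitles."""
    with io.open(path, "w", encoding=charset or DEFAULT_CHARSET) as f:
        f.write(text)
```

With something like this, `read_subtitles("subs.srt", charset="cp1252")` would decode a Western European file, while omitting `charset` keeps the current behaviour.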

This might not actually be needed, as subtitles in weird charsets seem to be quite rare.

tp7 avatar Oct 29 '14 05:10 tp7

So here's some feedback.

  1. "Not needed" only for modern anime subtitles.
  2. cp1252 is/was a default charset for Western Europe so it's not "weird".

tophf avatar Oct 29 '14 12:10 tophf

The ffmpeg demuxing issue is fixed in 1515e424bc1e6af7f2914a2c851a55eedf6bed14.

Now we need to figure out how to correctly open subtitles in different codepages in Sushi without any user input.

tp7 avatar Oct 29 '14 12:10 tp7

While the above-mentioned commit is a sensible addition to Sushi, it will hardly ever be possible to autodetect the encoding correctly in every case (see this), although some (semi-?)solutions exist (chardet, UnicodeDammit).
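To illustrate why detection without user input is only a semi-solution, here is a minimal stdlib-only sketch. It is not chardet or UnicodeDammit, which use statistical heuristics; it just checks for a BOM and then tries a fixed list of candidates. `guess_charset` and `FALLBACK_CHARSETS` are hypothetical names, and the candidate order is an assumption.

```python
import codecs

# Assumption: illustrative candidate order; a real tool would make this configurable.
FALLBACK_CHARSETS = ("utf-8", "cp1252")

# Byte-order marks give the only unambiguous signal.
BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def guess_charset(raw):
    """Return the first charset that decodes raw bytes without error, or None."""
    for bom, name in BOMS:
        if raw.startswith(bom):
            return name
    for name in FALLBACK_CHARSETS:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None
```

The weakness shows up quickly: almost any byte sequence is valid in a single-byte codepage, so Cyrillic text saved as cp1251 is also reported as `cp1252` by this function and would decode to mojibake. That is the case where a user-specified charset is still needed.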

tophf avatar Oct 29 '14 17:10 tophf