Add an option to discard style information from TTML subtitles

Open filip-hejsek opened this issue 1 year ago • 5 comments

It would be helpful to me to have an option to discard the styles from TTML subtitles.

This PR adds a --ttml-convert-no-style option which does just that.

filip-hejsek avatar Aug 11 '22 03:08 filip-hejsek

I think this needs some examples. I couldn't see any issue in the tracker that this would fix - is there one, or maybe in yt-dlp?

If the code is left as-is, I'd suggest changing the sense of convert_style=True to strip_style=False.

But ...

Which target sttl formats support styles? Rather than introducing a new option, could we just invent new sttl format names (eg with -plain appended) to stand for the named format with no styles?
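To make the `-plain` suggestion concrete, here is a minimal sketch of how a requested format name could be split into a base format and a "strip styles" flag. The helper name `split_plain_suffix` is hypothetical and does not exist in youtube-dl; it only illustrates the naming idea.

```python
def split_plain_suffix(format_name, suffix='-plain'):
    """Split a requested subtitle format name into (base_format, strip_styles).

    Hypothetical helper: 'srt-plain' means "srt with styles removed".
    """
    # Require a non-empty base so that a bare '-plain' is not treated as a format.
    if format_name.endswith(suffix) and len(format_name) > len(suffix):
        return format_name[:-len(suffix)], True
    return format_name, False


print(split_plain_suffix('srt-plain'))  # ('srt', True)
print(split_plain_suffix('vtt'))        # ('vtt', False)
```

With this approach no new command-line option is needed: the existing format selection carries the extra information.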

Then tests!

dirkf avatar Aug 11 '22 14:08 dirkf

Thanks for your feedback.

There is no issue in the tracker for this. The problem this is trying to solve is that some players or devices (TVs) don't support HTML in SRT and just display the tags.

I don't know whether this is also an issue for the other subtitle formats.

One problem with my patch is that it confusingly only removes styles for subtitles converted from TTML. A proper solution would also add support for stripping the styling from unconverted subtitle files. However, that would expand the scope significantly to support many formats. Alternatively, only stripping from SRT could be supported.

So I think it would be best to add an srt-plain format and add code to strip HTML styles from SRT files (which I'm willing to implement, as it's mostly just a regex replace). Stripping styles from other formats could be implemented in the future if/when someone needs it.
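As a rough illustration of the "regex replace" idea, the sketch below removes a small set of HTML-like tags from SRT text. The tag list (`b`, `i`, `u`, `font`) and the function name `strip_srt_styles` are assumptions for this example, not part of the patch, and the list may not cover every tag a given SRT file uses.

```python
import re

# Assumed set of styling tags; other tags are left untouched.
_SRT_STYLE_TAG_RE = re.compile(r'</?(?:b|i|u|font)(?:\s[^>]*)?>', re.IGNORECASE)


def strip_srt_styles(srt_text):
    """Remove opening and closing styling tags, keeping the cue text itself."""
    return _SRT_STYLE_TAG_RE.sub('', srt_text)


cue = (
    '1\n'
    '00:00:01,000 --> 00:00:02,000\n'
    '<font color="#ffffff" size=".5c"><i>Hello</i></font>\n'
)
print(strip_srt_styles(cue))
```

The timing line is unaffected because `-->` never forms one of the listed tags.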

What do you think?

filip-hejsek avatar Aug 12 '22 03:08 filip-hejsek

Sorry for the delayed response. I think srt-plain isn't a bad idea. Note the clean_html() function in utils.py.

dirkf avatar Nov 10 '22 16:11 dirkf

@dirkf Any progress on this? Styles on TTML subtitles are really a problem (especially the font size).

fenopa avatar Feb 05 '23 18:02 fenopa

@dirkf Any hope of implementing this? Maybe these styling options should be configurable?

    SUPPORTED_STYLING = [
        'color',
        'fontFamily',
        'fontSize',
        'fontStyle',
        'fontWeight',
        'textDecoration'
    ]

All I really want is to remove fontSize; someone else may want to remove only color, etc. Maybe it would be best to add an option to specify which styling to keep or omit?
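One way such a selection could work is sketched below: a cue's style attributes are filtered against the supported list minus a user-supplied set of names to omit. The function `select_styles` and its `omit` parameter are hypothetical and only illustrate the request; they are not an existing youtube-dl interface.

```python
SUPPORTED_STYLING = [
    'color',
    'fontFamily',
    'fontSize',
    'fontStyle',
    'fontWeight',
    'textDecoration'
]


def select_styles(style, omit=()):
    """Return only the supported style attributes that were not asked to be omitted."""
    omit = set(omit)
    unknown = omit - set(SUPPORTED_STYLING)
    if unknown:
        # Fail loudly on misspelled names instead of silently keeping the style.
        raise ValueError('unknown style name(s): %s' % ', '.join(sorted(unknown)))
    return {
        name: value
        for name, value in style.items()
        if name in SUPPORTED_STYLING and name not in omit
    }


style = {'color': '#ffffff', 'fontSize': '50%', 'fontStyle': 'italic'}
print(select_styles(style, omit=['fontSize']))
# {'color': '#ffffff', 'fontStyle': 'italic'}
```

Omitting every supported name would give the same result as discarding all styles.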

fenopa avatar Jan 14 '24 02:01 fenopa