WebVTT: support styled cues and voice spans

Open mbklein opened this issue 1 year ago • 0 comments

This PR adds support for advanced WebVTT features such as text formatting, CSS styling, voice spans, and ruby annotations.

To do so, I had to switch VTT parsing from node-webvtt to vtt.js. vtt.js is older, but neither of them has seen significant development in years, so I'm going with the theory that the lack of activity means more "stable" and less "abandoned."

The main advantage of this approach is access to the WebVTT.convertCueToDOMTree() method, which converts each WebVTT cue to a DOM DocumentFragment according to the rules laid out in Section 6.5 of the W3C WebVTT specification. This lets the consuming application style cues by class or voice, and it strips out any other tags that shouldn't be there.
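To make the mapping concrete, here is a rough, self-contained sketch of the kind of conversion described above: voice spans become `<span title="...">`, class spans become `<span class="...">`, basic formatting tags pass through, and anything else is dropped while its text is kept. This is not vtt.js's implementation and does not call it. `cueTextToHtml`, `openingTag`, and `escapeHtml` are hypothetical names, the `webvtt-cue` wrapper simply mirrors the "After" markup in the example below, and cue-text character references such as `&amp;` and timestamp tags are not handled.

```typescript
// Illustrative sketch only: a simplified take on the cue-text-to-DOM mapping,
// NOT the vtt.js implementation. All names here are hypothetical.
type OpenTag = { source: string; element: string; open: string };

// Tags that map to an HTML element of the same name.
const SIMPLE_TAGS = new Set(["b", "i", "u", "ruby", "rt"]);

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Parse a start tag body such as `v.loud interviewer`:
// tag name, optional .class list, optional annotation.
function openingTag(body: string): OpenTag | null {
  const match = /^([a-z]+)((?:\.[^\s.]+)*)(?:\s+(.*))?$/.exec(body);
  if (!match) return null;
  const [, name = "", classPart = "", rawAnnotation = ""] = match;
  const classes = classPart.split(".").filter(Boolean).join(" ");
  const annotation = rawAnnotation.trim();
  const classAttr = classes ? ` class="${escapeHtml(classes)}"` : "";

  if (name === "v") {
    // Voice span: the speaker name ends up in the title attribute.
    const title = annotation ? ` title="${escapeHtml(annotation)}"` : "";
    return { source: name, element: "span", open: `<span${title}${classAttr}>` };
  }
  if (name === "c") {
    return { source: name, element: "span", open: `<span${classAttr}>` };
  }
  if (name === "lang") {
    const lang = annotation ? ` lang="${escapeHtml(annotation)}"` : "";
    return { source: name, element: "span", open: `<span${lang}${classAttr}>` };
  }
  if (SIMPLE_TAGS.has(name)) {
    return { source: name, element: name, open: `<${name}${classAttr}>` };
  }
  return null; // unsupported tag: dropped, its text content is kept
}

function cueTextToHtml(cueText: string): string {
  const stack: OpenTag[] = [];
  const tagPattern = /<(\/?)([^<>]*)>/g;
  let out = "";
  let last = 0;
  let m: RegExpExecArray | null;

  while ((m = tagPattern.exec(cueText)) !== null) {
    // Emit the text that precedes this tag, escaped.
    out += escapeHtml(cueText.slice(last, m.index));
    last = tagPattern.lastIndex;
    const [, slash = "", body = ""] = m;

    if (slash === "") {
      const tag = openingTag(body.trim());
      if (tag) {
        stack.push(tag);
        out += tag.open;
      }
    } else {
      // Only close a tag that matches the innermost open one.
      const top = stack[stack.length - 1];
      if (top && top.source === body.trim()) {
        stack.pop();
        out += `</${top.element}>`;
      }
    }
  }

  out += escapeHtml(cueText.slice(last));
  // Close anything the cue left open.
  while (stack.length > 0) {
    const tag = stack.pop();
    if (tag) out += `</${tag.element}>`;
  }
  return `<div class="webvtt-cue">${out}</div>`;
}
```

Calling `cueTextToHtml("<v interviewer>How did you meet your wife?</v>")` yields a wrapper `div` containing a `span` whose `title` is `interviewer`, which is what the `.webvtt-cue [title="..."]` selectors target.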

Behavior for untagged cues is unchanged.

Example

Cue

00:00:02.320 --> 00:00:04.062
<v interviewer>How did you meet your wife?</v>

Before

&lt;v interviewer&gt;How did you meet your wife?&lt;/v&gt;
<strong>0:02</strong>

[Screenshot SCR-20240606-nyki: cue rendering before this change]

After (with styling applied to the resulting `.webvtt-cue [title="..."]` selectors)

<div class="webvtt-cue">
  <span title="interviewer">How did you meet your wife?</span>
</div>
<strong>0:02</strong>

[Screenshot SCR-20240606-nxii: cue rendering after this change]

Jun 06 '24 20:06 mbklein
