
Separate subtitle and closed captions displaying

Open gizomo opened this issue 9 months ago • 2 comments

What version of Hls.js are you using?

1.6.2

What browser (including version) are you using?

Firefox 137.0.2 x64, Chrome 132.0.6834.159 x64

What OS (including version) are you using?

Kubuntu 24.04

Test stream

https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8

Configuration

{
  debug: false,
  maxBufferLength: 60,
  maxMaxBufferLength: 600,
  appendErrorMaxRetry: 6,
  enableWorker: Boolean(window.Worker)
}

Additional player setup steps

No response

Checklist

  • [x] The issue observed is not already reported by searching on Github under https://github.com/video-dev/hls.js/issues
  • [x] The issue occurs in the stable client (latest release) on https://hlsjs.video-dev.org/demo and not just on my page
  • [x] The issue occurs in the latest client (main branch) on https://hlsjs-dev.video-dev.org/demo and not just on my page
  • [x] The stream has correct Access-Control-Allow-Origin headers (CORS)
  • [x] There are no network errors such as 404s in the browser console when trying to play the stream

Steps to reproduce

  1. Open test stream
  2. Wait to load subtitles
  3. Try to get hls.allSubtitleTracks or hls.subtitleTracks
  4. Try to select the "English" subtitle (hls.subtitleTrack = 0)
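
For steps 3 and 4, a small helper like the one below can dump what hls.js reports next to what the media element exposes. This is only a sketch: `summarizeTextState` is a hypothetical name, and it assumes each entry in hls.subtitleTracks carries `name`, `lang` and `groupId` fields and that hls.media is the attached video element.

```javascript
// Hypothetical helper: collects the subtitle tracks reported by hls.js
// and the text tracks exposed by the attached media element.
// Assumes entries of hls.subtitleTracks have name, lang and groupId.
function summarizeTextState(hls) {
  const subtitleTracks = hls.subtitleTracks.map((track) => ({
    name: track.name,
    lang: track.lang,
    groupId: track.groupId,
  }));
  // textTracks is an array-like TextTrackList, so convert it with Array.from.
  const textTracks = Array.from(hls.media.textTracks, (track) => ({
    kind: track.kind,
    label: track.label,
    language: track.language,
    mode: track.mode,
  }));
  return { subtitleTracks, textTracks };
}
```

Calling `console.log(summarizeTextState(hls))` after the subtitle tracks have loaded makes it easy to compare the two lists.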

Expected behaviour

We expect hls.allSubtitleTracks or hls.subtitleTracks to return an array containing both the subtitles and the closed captions from the parsed playlist. We also expect to be able to select subtitles and closed captions individually, so that each can be displayed separately.

What actually happened?

The playlist contains subtitles in WebVTT format and closed captions in CEA-608 format.

#EXT-X-MEDIA:TYPE=CLOSED-CAPTIONS,GROUP-ID="cc1",LANGUAGE="en",NAME="English",AUTOSELECT=YES,DEFAULT=YES,INSTREAM-ID="CC1"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="sub1",LANGUAGE="en",NAME="English",AUTOSELECT=YES,DEFAULT=YES,FORCED=NO,URI="s1/en/prog_index.m3u8"

After the subtitle tracks have been loaded, hls.allSubtitleTracks and hls.subtitleTracks include only one track group, "sub1", with a single "English" subtitle in it. There is no "cc1" track group and there are no closed caption tracks.

If we set hls.subtitleTrack to 0 (the index of the "English" subtitle from the "sub1" text group), the player displays both the "English" subtitle and the "English" closed caption.

Image

Our guess is that when the subtitles and the closed captions have the same attributes, hls.js by default does not create a separate track for the closed caption and displays both the subtitle and the closed caption at the same time.
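
To make the overlap concrete, the two EXT-X-MEDIA lines from the playlist can be parsed and compared. The sketch below is illustrative only: `parseMediaTag` and `sharedAttributes` are hypothetical helpers, not the hls.js parser, and the regular expression handles just the quoted-string and bare-token values that appear in these two lines.

```javascript
// Parse the attribute list of an #EXT-X-MEDIA line into a plain object.
// Illustrative only; this is not how hls.js itself parses playlists.
function parseMediaTag(line) {
  const prefix = '#EXT-X-MEDIA:';
  if (!line.startsWith(prefix)) {
    return null;
  }
  const attrs = {};
  // KEY=VALUE pairs, where VALUE is a quoted string or a bare token.
  const pattern = /([A-Z0-9-]+)=("[^"]*"|[^,]*)/g;
  for (const [, key, raw] of line.slice(prefix.length).matchAll(pattern)) {
    attrs[key] = raw.startsWith('"') ? raw.slice(1, -1) : raw;
  }
  return attrs;
}

// Return the keys whose values are present and identical in both objects.
function sharedAttributes(a, b, keys) {
  return keys.filter((key) => a[key] !== undefined && a[key] === b[key]);
}

const captions = parseMediaTag(
  '#EXT-X-MEDIA:TYPE=CLOSED-CAPTIONS,GROUP-ID="cc1",LANGUAGE="en",NAME="English",AUTOSELECT=YES,DEFAULT=YES,INSTREAM-ID="CC1"'
);
const subtitles = parseMediaTag(
  '#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="sub1",LANGUAGE="en",NAME="English",AUTOSELECT=YES,DEFAULT=YES,FORCED=NO,URI="s1/en/prog_index.m3u8"'
);
const shared = sharedAttributes(captions, subtitles, [
  'TYPE', 'GROUP-ID', 'LANGUAGE', 'NAME', 'AUTOSELECT', 'DEFAULT',
]);
console.log(shared); // [ 'LANGUAGE', 'NAME', 'AUTOSELECT', 'DEFAULT' ]
```

The two renditions differ only in TYPE and GROUP-ID among the compared keys, which is why matching on name or language alone cannot tell them apart.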

Opening the test stream on https://hlsjs.video-dev.org/demo or on https://hlsjs-dev.video-dev.org/demo shows the same behavior. However, the native controls of the HTML5 video element offer two subtitle options, one for the subtitle and another for the closed caption, and selecting between them works correctly.

Image

Console output

[log] > [transmuxer.ts]: Flushed audio sn: 8 of track 0 transmuxer-interface.ts:396:40
[log] > [audio-stream-controller]: PARSING->PARSED base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Parsed audio sn: 8 of track 0 (frag:[41.919-47.913]) base-stream-controller.ts:2040:9
[log] > [stream-controller]: Loaded main sn: 8 of level 7 base-stream-controller.ts:512:15
[log] > [stream-controller]: FRAG_LOADING->PARSING base-stream-controller.ts:2093:11
[log] > [transmuxer.ts]: Flushed main sn: 8 of level 7 transmuxer-interface.ts:396:40
[log] > [audio-stream-controller]: Buffered audio sn: 8 of track 0 (frag:[41.919-47.913] > buffer:[0.000-47.913]) base-stream-controller.ts:708:9
[log] > [audio-stream-controller]: PARSED->IDLE base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Loading audio sn: 9 of track 0 (frag:[47.913-53.908]) cc: 0 [1-101], target: 47.913 base-stream-controller.ts:899:9
[log] > [audio-stream-controller]: IDLE->FRAG_LOADING base-stream-controller.ts:2093:11
[log] > [stream-controller]: PARSING->PARSED base-stream-controller.ts:2093:11
[log] > [stream-controller]: Parsed main sn: 8 of level 7 (frag:[42.000-48.000]) base-stream-controller.ts:2040:9
[log] > [stream-controller]: Buffered main sn: 8 of level 7 (frag:[42.000-48.000] > buffer:[0.033-48.017]) base-stream-controller.ts:708:9
[log] > [stream-controller]: PARSED->IDLE base-stream-controller.ts:2093:11
[log] > [stream-controller]: Loading main sn: 9 of level 7 (frag:[48.000-54.000]) cc: 0 [1-100], target: 48.017 base-stream-controller.ts:899:9
[log] > [stream-controller]: IDLE->FRAG_LOADING base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Loaded audio sn: 9 of track 0 base-stream-controller.ts:512:15
[log] > [audio-stream-controller]: FRAG_LOADING->PARSING base-stream-controller.ts:2093:11
[log] > [transmuxer.ts]: Flushed audio sn: 9 of track 0 transmuxer-interface.ts:396:40
[log] > [audio-stream-controller]: PARSING->PARSED base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Parsed audio sn: 9 of track 0 (frag:[47.913-53.908]) base-stream-controller.ts:2040:9
[log] > [stream-controller]: Loaded main sn: 9 of level 7 base-stream-controller.ts:512:15
[log] > [stream-controller]: FRAG_LOADING->PARSING base-stream-controller.ts:2093:11
[log] > [transmuxer.ts]: Flushed main sn: 9 of level 7 transmuxer-interface.ts:396:40
[log] > [audio-stream-controller]: Buffered audio sn: 9 of track 0 (frag:[47.913-53.908] > buffer:[0.000-53.908]) base-stream-controller.ts:708:9
[log] > [audio-stream-controller]: PARSED->IDLE base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Loading audio sn: 10 of track 0 (frag:[53.908-59.903]) cc: 0 [1-101], target: 53.908 base-stream-controller.ts:899:9
[log] > [audio-stream-controller]: IDLE->FRAG_LOADING base-stream-controller.ts:2093:11
[log] > [stream-controller]: PARSING->PARSED base-stream-controller.ts:2093:11
[log] > [stream-controller]: Parsed main sn: 9 of level 7 (frag:[48.000-54.000]) base-stream-controller.ts:2040:9
[log] > [stream-controller]: Buffered main sn: 9 of level 7 (frag:[48.000-54.000] > buffer:[0.033-54.017]) base-stream-controller.ts:708:9
[log] > [stream-controller]: PARSED->IDLE base-stream-controller.ts:2093:11
[log] > [stream-controller]: Loading main sn: 10 of level 7 (frag:[54.000-60.000]) cc: 0 [1-100], target: 54.017 base-stream-controller.ts:899:9
[log] > [stream-controller]: IDLE->FRAG_LOADING base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Loaded audio sn: 10 of track 0 base-stream-controller.ts:512:15
[log] > [audio-stream-controller]: FRAG_LOADING->PARSING base-stream-controller.ts:2093:11
[log] > [transmuxer.ts]: Flushed audio sn: 10 of track 0 transmuxer-interface.ts:396:40
[log] > [audio-stream-controller]: PARSING->PARSED base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Parsed audio sn: 10 of track 0 (frag:[53.908-59.903]) base-stream-controller.ts:2040:9
[log] > [stream-controller]: Loaded main sn: 10 of level 7 base-stream-controller.ts:512:15
[log] > [stream-controller]: FRAG_LOADING->PARSING base-stream-controller.ts:2093:11
[log] > [transmuxer.ts]: Flushed main sn: 10 of level 7 transmuxer-interface.ts:396:40
[log] > [audio-stream-controller]: Buffered audio sn: 10 of track 0 (frag:[53.908-59.903] > buffer:[0.000-59.903]) base-stream-controller.ts:708:9
[log] > [audio-stream-controller]: PARSED->IDLE base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Loading audio sn: 11 of track 0 (frag:[59.903-65.897]) cc: 0 [1-101], target: 59.903 base-stream-controller.ts:899:9
[log] > [audio-stream-controller]: IDLE->FRAG_LOADING base-stream-controller.ts:2093:11
[log] > [stream-controller]: PARSING->PARSED base-stream-controller.ts:2093:11
[log] > [stream-controller]: Parsed main sn: 10 of level 7 (frag:[54.000-60.000]) base-stream-controller.ts:2040:9
[log] > [stream-controller]: Buffered main sn: 10 of level 7 (frag:[54.000-60.000] > buffer:[0.033-60.017]) base-stream-controller.ts:708:9
[log] > [stream-controller]: PARSED->IDLE base-stream-controller.ts:2093:11
[log] > [stream-controller]: Loading main sn: 11 of level 7 (frag:[60.000-66.000]) cc: 0 [1-100], target: 60.017 base-stream-controller.ts:899:9
[log] > [stream-controller]: IDLE->FRAG_LOADING base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Loaded audio sn: 11 of track 0 base-stream-controller.ts:512:15
[log] > [audio-stream-controller]: FRAG_LOADING->PARSING base-stream-controller.ts:2093:11
[log] > [transmuxer.ts]: Flushed audio sn: 11 of track 0 transmuxer-interface.ts:396:40
[log] > [audio-stream-controller]: PARSING->PARSED base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Parsed audio sn: 11 of track 0 (frag:[59.903-65.897]) base-stream-controller.ts:2040:9
[log] > [stream-controller]: Loaded main sn: 11 of level 7 base-stream-controller.ts:512:15
[log] > [stream-controller]: FRAG_LOADING->PARSING base-stream-controller.ts:2093:11
[log] > [transmuxer.ts]: Flushed main sn: 11 of level 7 transmuxer-interface.ts:396:40
[log] > [audio-stream-controller]: Buffered audio sn: 11 of track 0 (frag:[59.903-65.897] > buffer:[0.000-65.897]) base-stream-controller.ts:708:9
[log] > [audio-stream-controller]: PARSED->IDLE base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Loading audio sn: 12 of track 0 (frag:[65.897-71.892]) cc: 0 [1-101], target: 65.897 base-stream-controller.ts:899:9
[log] > [audio-stream-controller]: IDLE->FRAG_LOADING base-stream-controller.ts:2093:11
[log] > [stream-controller]: PARSING->PARSED base-stream-controller.ts:2093:11
[log] > [stream-controller]: Parsed main sn: 11 of level 7 (frag:[60.000-66.000]) base-stream-controller.ts:2040:9
[log] > [stream-controller]: Buffered main sn: 11 of level 7 (frag:[60.000-66.000] > buffer:[0.033-66.017]) base-stream-controller.ts:708:9
[log] > [stream-controller]: PARSED->IDLE base-stream-controller.ts:2093:11
[log] > [stream-controller]: Loading main sn: 12 of level 7 (frag:[66.000-72.000]) cc: 0 [1-100], target: 66.017 base-stream-controller.ts:899:9
[log] > [stream-controller]: IDLE->FRAG_LOADING base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Loaded audio sn: 12 of track 0 base-stream-controller.ts:512:15
[log] > [audio-stream-controller]: FRAG_LOADING->PARSING base-stream-controller.ts:2093:11
[log] > [transmuxer.ts]: Flushed audio sn: 12 of track 0 transmuxer-interface.ts:396:40
[log] > [audio-stream-controller]: PARSING->PARSED base-stream-controller.ts:2093:11
[log] > [audio-stream-controller]: Parsed audio sn: 12 of track 0 (frag:[65.897-71.892]) base-stream-controller.ts:2040:9
[log] > [stream-controller]: Loaded main sn: 12 of level 7 base-stream-controller.ts:512:15
[log] > [stream-controller]: FRAG_LOADING->PARSING base-stream-controller.ts:2093:11
[log] > [transmuxer.ts]: Flushed main sn: 12 of level 7 transmuxer-interface.ts:396:40
[log] > [audio-stream-controller]: Buffered audio sn: 12 of track 0 (frag:[65.897-71.892] > buffer:[0.000-71.892]) base-stream-controller.ts:708:9
[log] > [audio-stream-controller]: PARSED->IDLE base-stream-controller.ts:2093:11
[log] > [stream-controller]: PARSING->PARSED base-stream-controller.ts:2093:11
[log] > [stream-controller]: Parsed main sn: 12 of level 7 (frag:[66.000-72.000]) base-stream-controller.ts:2040:9
[log] > [stream-controller]: Buffered main sn: 12 of level 7 (frag:[66.000-72.000] > buffer:[0.033-72.017]) base-stream-controller.ts:708:9
[log] > [stream-controller]: PARSED->IDLE base-stream-controller.ts:2093:11

Chrome media internals output


gizomo avatar May 13 '25 04:05 gizomo

Does changing the NAME attribute of one of the options work around the issue (NAME="English CC", for example)?

robwalch avatar May 13 '25 15:05 robwalch

@robwalch Changing the NAME attribute is not a practical option, because we do not always have that opportunity (third-party streams, for instance). We assumed that the matching names were the cause only from reading how subtitle-track-controller and timeline-controller work. If we understood correctly, closed captions are not parsed as a separate MediaPlaylist during manifest loading; instead, a track is created on the fly only if the CC has the same name as the selected subtitle MediaPlaylist.

gizomo avatar May 14 '25 02:05 gizomo

hls.allSubtitleTracks and hls.subtitleTracks only list subtitle tracks. Caption tracks are not exposed in this interface. If you want to select between them, use the HTMLMediaElement.textTracks interface:

Image

Set one of the tracks to "showing" and all others to "disabled":

hls.media.textTracks[1].mode = 'showing';
hls.media.textTracks[0].mode = 'disabled';
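
The same idea can be wrapped in a helper so the indexes are not hard-coded. This is a minimal sketch: `showOnlyTextTrack` and `findTextTrackIndexByKind` are hypothetical names, and it assumes the captions track is exposed with the standard TextTrack kind 'captions' and the subtitle track with 'subtitles', which should be checked against the actual textTracks list.

```javascript
// Set one text track to "showing" and every other track to "disabled".
// `textTracks` can be any array-like of objects with a writable `mode`,
// such as hls.media.textTracks.
function showOnlyTextTrack(textTracks, indexToShow) {
  for (let i = 0; i < textTracks.length; i++) {
    textTracks[i].mode = i === indexToShow ? 'showing' : 'disabled';
  }
}

// Return the index of the first track with the given kind, or -1 if none.
// Assumption: kinds are 'subtitles' and 'captions' for the two tracks here.
function findTextTrackIndexByKind(textTracks, kind) {
  for (let i = 0; i < textTracks.length; i++) {
    if (textTracks[i].kind === kind) {
      return i;
    }
  }
  return -1;
}
```

For example, `showOnlyTextTrack(hls.media.textTracks, findTextTrackIndexByKind(hls.media.textTracks, 'captions'))` would show only the captions; if no track of that kind exists, the index is -1 and every track ends up disabled.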

robwalch avatar Jun 23 '25 18:06 robwalch