Feature request: Subtitles-only (#254)

Open dimetime opened this issue 1 year ago • 6 comments

#254 TL;DR: Apparently yle-dl downloads everything from Areena rather than cherry-picking separate streams?

Thus, if downloading everything is the only option, could the video/audio streams be directed to /dev/null or something similar while downloading?

Or could a --subtitles-only option be implemented on top of --backend wget, removing any downloaded video/audio data afterwards?

dimetime avatar May 06 '23 12:05 dimetime

Yle-dl indeed always downloads all streams.

If you don't mind temporarily downloading the video file, you can create a postprocessing script that deletes the video file after downloading it (keeping just the subtitles file).

Create a new file called keep-only-subtitles:

#!/bin/sh

# Remove the video file
# $1 is the name of the downloaded video file
# ($2 is the subtitles file)
rm "$1"

Then tell yle-dl to execute a postprocessing step after the download is complete:

yle-dl --backend wget --postprocess ./keep-only-subtitles https://areena.yle.fi/1-61825068

This assumes that you are calling this from the directory where keep-only-subtitles is located and that the script is executable (chmod +x keep-only-subtitles).

aajanki avatar May 10 '23 18:05 aajanki

To save a large amount of bandwidth (above 99%), is it viable to implement downloading only subtitles? For reference, youtube-dl and yt-dlp support the feature.

ghost avatar Aug 18 '23 05:08 ghost

Can yle-dl give you the subtitles as a separate extracted file, rather than embedded in the .mkv file? At least I didn't see such an option when I glanced through the docs.

IlmariKu avatar Jun 15 '24 12:06 IlmariKu

It's not possible to download just the subtitles with yle-dl.

However, you can download the .mkv file and then extract the subtitles yourself using a tool such as ffmpeg.
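As a rough sketch of that extraction step (not something yle-dl itself provides): the helper below wraps a single ffmpeg call. The function name extract_subtitles is made up for illustration, and the sketch assumes the .mkv contains at least one text-based subtitle track and that the first one is the one you want.

```shell
#!/bin/sh
# Hypothetical helper: copy the first subtitle track of an .mkv into an .srt file
# next to it. Assumes ffmpeg is installed and the track is text-based.
extract_subtitles() {
    if [ -z "$1" ]; then
        echo "usage: extract_subtitles VIDEO.mkv" >&2
        return 1
    fi
    video=$1
    # Replace the last extension with .srt (video.mkv -> video.srt)
    out="${video%.*}.srt"
    # -map 0:s:0 selects the first subtitle stream of the first input file
    ffmpeg -i "$video" -map 0:s:0 "$out"
}
```

After sourcing the file, extract_subtitles "episode.mkv" would write episode.srt. If the file has several subtitle languages, changing 0:s:0 to 0:s:1 selects the second track, and so on.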

aajanki avatar Jun 15 '24 14:06 aajanki

But I have a question, @aajanki: when I ran yle-dl with --showmetadata, it listed the Finnish subtitles as one of the URLs. Is this an exception to how subtitles are normally made available? I haven't checked how the code works, but I'm guessing the metadata (and the CDN link) is available without downloading the video?

Reproduce: yle-dl https://areena.yle.fi/1-1414632 --showmetadata

    "duration_seconds": 1709,
    "subtitles": [
      {
        "language": "fin",
        "url": "https://cdnapi-legacy.kaltura.com/api_v3/index.php/service/caption_captionAsset/action/serve/captionAssetId/1_zknqo0g1/ks/MDc3NWI3MzcxOWE2ZWU4NDgyMzA1MWQ3NDhlMzlkNzEyZjhjYTBiNnwxOTU1MDMxOzE5NTUwMzE7MTcxODQ3NjU4OTswOzMwNTIzO292cEB5bGUuZmk7ZG93bmxvYWQ6MV85YjFrMmEzYw==",
        "category": "ohjelmatekstitys"
      }
    ],

IlmariKu avatar Jun 16 '24 08:06 IlmariKu

Even though --showmetadata sometimes lists subtitles, yle-dl never downloads these external subtitle files. The subtitle section in the metadata is left over from an earlier yle-dl version. Of course, you can download the subtitle files manually.

Areena provides these external subtitle files only on some videos. Recently published videos don't seem to contain those anymore.
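For the videos that still list external subtitles, the manual download could be scripted along these lines. This is only a sketch: the function name subtitle_urls is made up, and it assumes the metadata is pretty-printed in the layout quoted above, with a "subtitles": [ ... ] array holding one "url" line per entry. It is plain text matching, not a real JSON parser.

```shell
#!/bin/sh
# Hypothetical helper: read pretty-printed metadata JSON on stdin and print
# the subtitle URLs, one per line. Relies on the line layout, not on JSON parsing.
subtitle_urls() {
    awk '
        # Start collecting at the line that opens the subtitles array
        /"subtitles"[[:space:]]*:[[:space:]]*\[/ { in_subs = 1 }
        in_subs && /"url"[[:space:]]*:/ {
            line = $0
            sub(/^[^:]*:[[:space:]]*"/, "", line)  # strip the key and the opening quote
            sub(/".*$/, "", line)                  # strip the closing quote and the rest
            print line
        }
        # Stop at the line that closes the array
        in_subs && /\]/ { in_subs = 0 }
    '
}

# Possible usage (requires yle-dl and curl):
#   yle-dl --showmetadata https://areena.yle.fi/1-1414632 | subtitle_urls |
#       while read -r url; do curl -L -O "$url"; done
```

Only URLs inside the subtitles array are printed, so other "url" fields in the metadata are skipped. The output is empty for videos without external subtitle files.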

aajanki avatar Jun 16 '24 16:06 aajanki