Command for getting captions of the current entry

Open TheerapakG opened this issue 5 years ago • 0 comments

After creating your pull request, tick these boxes if they are applicable to you.

  • [x] I have tested my changes against the review branch (the latest developmental version), and this pull request is targeting that branch as a base
  • [x] I have tested my changes on Python 3.5/3.6

Description

This is a feature PR

This PR introduces a command that displays the captions belonging to the current video. All captions are downloaded automatically along with the video, because downloading a caption file every time the command is invoked takes time. However, there is also a command to reload the captions. When the command is invoked, the bot tries to load the caption file, parse it, and remove duplicated occurrences of caption text such as

...
15
00:00:28,810 --> 00:00:30,040
A refuge for failures to gather in 140 characters

16
00:00:30,040 --> 00:00:30,340
"Three"
A refuge for failures to gather in 140 characters

17
00:00:30,340 --> 00:00:30,640
"Two"
A refuge for failures to gather in 140 characters

18
00:00:30,640 --> 00:00:31,210
"One"
A refuge for failures to gather in 140 characters

19
00:00:31,280 --> 00:00:33,650
With a game of Kagome Kagome, an uproar is stirred

20
00:00:33,650 --> 00:00:33,660
If you light a spark in a place without flame
With a game of Kagome Kagome, an uproar is stirred

21
00:00:33,660 --> 00:00:35,380
If you light a spark in a place without flame
...

so that it will yield this result

...
A refuge for failures to gather in 140 characters
"Three"
"Two"
"One"
With a game of Kagome Kagome, an uproar is stirred
If you light a spark in a place without flame
...
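
The de-duplication described above can be sketched roughly as follows. This is not the code from this PR, only a minimal illustration under two assumptions: the caption file is in SRT format, and each repeated phrase sits on its own line within a cue. The names `parse_srt`, `dedupe_captions`, and `SAMPLE` are hypothetical. The rule used here is to keep only the lines of a cue that the immediately preceding cue did not already show.

```python
import re

# Matches an SRT timing line such as "00:00:28,810 --> 00:00:30,040".
TIMING = re.compile(r"\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}")


def parse_srt(text):
    """Return a list of cues, each cue being the list of its text lines."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        # Everything after the timing line is caption text.
        for i, line in enumerate(lines):
            if TIMING.match(line):
                cues.append(lines[i + 1:])
                break
    return cues


def dedupe_captions(cues):
    """Keep only the lines of each cue that the previous cue did not show."""
    result = []
    previous = []
    for lines in cues:
        result.extend(line for line in lines if line not in previous)
        previous = lines
    return result


# Hypothetical sample built from the excerpt in this description.
SAMPLE = """\
15
00:00:28,810 --> 00:00:30,040
A refuge for failures to gather in 140 characters

16
00:00:30,040 --> 00:00:30,340
"Three"
A refuge for failures to gather in 140 characters

17
00:00:30,340 --> 00:00:30,640
"Two"
A refuge for failures to gather in 140 characters

18
00:00:30,640 --> 00:00:31,210
"One"
A refuge for failures to gather in 140 characters

19
00:00:31,280 --> 00:00:33,650
With a game of Kagome Kagome, an uproar is stirred

20
00:00:33,650 --> 00:00:33,660
If you light a spark in a place without flame
With a game of Kagome Kagome, an uproar is stirred

21
00:00:33,660 --> 00:00:35,380
If you light a spark in a place without flame
"""

if __name__ == "__main__":
    print("\n".join(dedupe_captions(parse_srt(SAMPLE))))
```

Comparing only against the previous cue (rather than against every line seen so far) keeps intentionally repeated lyrics, such as a chorus, from being dropped later in the song.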

The bot then sends the result by direct message, because captions take up a lot of screen space and are therefore unsuitable for the chat channel.
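
Because Discord limits a single message to 2000 characters, a long caption generally has to be split before it is sent. The sketch below shows one way to do that on line boundaries. It is an assumption about how this could be handled, not a description of what this PR does, and `chunk_lines` and `DISCORD_MESSAGE_LIMIT` are hypothetical names.

```python
DISCORD_MESSAGE_LIMIT = 2000  # Discord's per-message character limit


def chunk_lines(lines, limit=DISCORD_MESSAGE_LIMIT):
    """Group lines into newline-joined chunks no longer than `limit`."""
    chunks = []
    current = ""
    for line in lines:
        # Hard-split any single line that alone exceeds the limit.
        pieces = [line[i:i + limit] for i in range(0, len(line), limit)] or [""]
        for piece in pieces:
            candidate = piece if not current else current + "\n" + piece
            if len(candidate) <= limit:
                current = candidate
            else:
                # The current chunk is full; start a new one with this piece.
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own direct message; the exact send call depends on the Discord library version in use.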

In testing, Discord sometimes failed to display captions containing a lot of text in some languages (such as Thai): parts of the text disappeared. This cannot be avoided on the bot's side, as it is a problem with Discord and not with the bot.

Things I might do later:

  • Detect whether a Japanese subtitle contains both kanji and furigana. If so, add parentheses (or Japanese quotation marks). This may require an additional dependency (pykakasi), but I can make it optional. Example: https://www.youtube.com/watch?v=Tq49NR_HzfY

  • Some captions contain multiple languages (a transliteration of the lyrics together with a translation). We might be able to use a simple heuristic to tell them apart, since the transliteration and the translation differ in distinct ways (for example, one of them is in parentheses). Example: https://www.youtube.com/watch?v=ZEy36W1xX8c
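
As a rough illustration of the parenthesis heuristic mentioned in the last item, the sketch below separates caption lines that are fully wrapped in parentheses from those that are not. It is only an assumption about how such a heuristic could look; `split_parenthesized` is a hypothetical name, and real captions would likely need more cases than this.

```python
def split_parenthesized(lines):
    """Separate caption lines wrapped in parentheses from the rest."""
    plain, wrapped = [], []
    for line in lines:
        text = line.strip()
        # Accept both ASCII and full-width parentheses.
        if len(text) >= 2 and text[0] in "(（" and text[-1] in ")）":
            wrapped.append(text[1:-1].strip())
        else:
            plain.append(text)
    return plain, wrapped
```

For example, `split_parenthesized(["first line", "(second line)"])` puts `"first line"` in the first list and `"second line"` in the second, so the two languages could be shown separately.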

Related issues (if applicable)

TheerapakG, Jan 10 '19 15:01