
BilibiliTranscriptReader not working properly

Open pingzhili opened this issue 1 year ago • 2 comments

I've tried the given example URL and several other URLs. They all raise the same warning and return an empty document:

UserWarning: No subtitles found for video: https://www.bilibili.com/video/BV1Km4y1z723/. Return Empty transcript.
  warnings.warn(
[Document(text='', doc_id='9ea3b653-2696-468c-8586-045882f39530', embedding=None, doc_hash='dc937b59892604f5a86ac96936cd7ff09e25f18ae6b758e8014a24c7fa039e91', extra_info=None)]

It seems that bilibili_api is not working properly? @AlexZhangji

pingzhili avatar Mar 31 '23 13:03 pingzhili

@pingzhili, thank you for bringing this up!

It seems that the video you shared may not have any subtitles or transcripts available. Many recent uploads to Bilibili no longer include subtitles by default (previously, the auto-subtitle option was enabled by default).

The reader still works for videos that have subtitles. For example, this video from the same channel has subtitles available: https://www.bilibili.com/video/BV14P411T79F
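
To tell the two cases apart in a script, one option is to record the warning and check each returned document for empty text. The following is a minimal sketch, not the reader's own code: `Document` and `fake_load_data` are hypothetical stand-ins that only mimic the behavior reported in this issue, and the sketch assumes the loader returns one document per URL, in order.

```python
import warnings
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for the loader's Document; only `text` is used."""
    text: str


def fake_load_data(video_urls):
    """Hypothetical stand-in for the reader: warns and returns empty text
    when a video has no subtitles, mirroring the behavior reported above."""
    subtitles = {"https://www.bilibili.com/video/BV14P411T79F": "example transcript"}
    docs = []
    for url in video_urls:
        text = subtitles.get(url, "")
        if not text:
            warnings.warn(f"No subtitles found for video: {url}. Return Empty transcript.")
        docs.append(Document(text=text))
    return docs


def load_with_report(load_fn, video_urls):
    """Call the loader, record its warnings, and split URLs by whether text came back.

    Assumes load_fn returns one document per URL, in the same order.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure repeated warnings are recorded
        docs = load_fn(video_urls)
    found = [(u, d) for u, d in zip(video_urls, docs) if d.text.strip()]
    missing = [u for u, d in zip(video_urls, docs) if not d.text.strip()]
    return found, missing, [str(w.message) for w in caught]


urls = [
    "https://www.bilibili.com/video/BV1Km4y1z723/",
    "https://www.bilibili.com/video/BV14P411T79F",
]
found, missing, messages = load_with_report(fake_load_data, urls)
print("missing subtitles:", missing)
```

With the real reader, `load_fn` would be replaced by the loader's own loading call. The point of the sketch is only that an empty `text` field, rather than an exception, signals a video without subtitles.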

AlexZhangji avatar Apr 01 '23 01:04 AlexZhangji

After further testing, I found that some videos do have subtitles, but for some reason their transcripts are not returned. Unfortunately, there is no official API for this, and changes to the Bilibili website can sometimes break the reader.

I will investigate this matter further and see if we can find a solution. In the meantime, please feel free to try out other methods or contribute!

AlexZhangji avatar Apr 01 '23 01:04 AlexZhangji

It seems that this issue isn't directly related to llama-hub, so I'm closing it for now.

EmanuelCampos avatar Sep 22 '23 11:09 EmanuelCampos