
[BUG] Lambda-based YouTube video transcript download seems to be blocked

Open typex1 opened this issue 1 year ago • 2 comments

Describe the bug


Creating a new bot with a knowledge base built from YouTube transcripts fails. The frontend shows this error: "Failed to detect language: Could not retrieve a transcript for the video https://www.youtube.com/watch?v=Pv0cfsastFs"

This error message is misleading: the failure is not specific to language detection. The transcript API as a whole appears to be unusable.

It took considerable research to gather good evidence that AWS-owned IP addresses are (currently) blocked from downloading YouTube transcripts. This applies at least to Lambda functions and Cloud9. I tested in us-east-1, eu-central-1, and ap-northeast-1.
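
To illustrate how the two failure modes could be told apart, here is a minimal sketch, not the project's actual code. It assumes the transcript download is passed in as a callable (`fetch`, a hypothetical parameter), so that a failed retrieval raises its own error instead of surfacing as "Failed to detect language". The names `TranscriptUnavailableError`, `extract_video_id`, and `load_transcript` are invented for this example.

```python
from typing import Callable, Optional
from urllib.parse import parse_qs, urlparse


class TranscriptUnavailableError(Exception):
    """Hypothetical error: the transcript itself could not be retrieved."""


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a watch or youtu.be URL, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in ("youtu.be", "www.youtu.be"):
        return parsed.path.lstrip("/") or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


def load_transcript(url: str, fetch: Callable[[str], str]) -> str:
    """Fetch a transcript and report retrieval failures as such.

    `fetch` is an assumed stand-in for whatever function performs the
    real download; it takes a video ID and returns the transcript text.
    """
    video_id = extract_video_id(url)
    if video_id is None:
        raise ValueError(f"Not a recognized YouTube URL: {url}")
    try:
        return fetch(video_id)
    except Exception as exc:
        # Keep the original exception as the cause so the root problem
        # (for example, a blocked request) stays visible in the logs.
        raise TranscriptUnavailableError(
            f"Could not retrieve a transcript for {url}. "
            "The request may have been blocked for this network."
        ) from exc
```

With this split, language detection would only run after `load_transcript` succeeds, so a blocked download could no longer be reported as a language detection problem.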

To Reproduce

Steps to reproduce the behavior:

  • Create a new bot and add any YouTube URL to its knowledge base. Once the sync phase finishes, the error described above is shown.

Screenshots


Additional context

Screenshot 2024-08-13 at 14 26 53

typex1 avatar Aug 13 '24 12:08 typex1

Error state as shown in the bot overview list: Screenshot 2024-08-13 at 14 47 41

typex1 avatar Aug 13 '24 12:08 typex1

The problem is caused by the youtube-transcript-api library. Related issues:

  • https://github.com/jdepoix/youtube-transcript-api/issues/293
  • https://github.com/jdepoix/youtube-transcript-api/issues/303

We may remove the YouTube feature from the KnowledgeBase integration in the future. Thank you for your understanding.

statefb avatar Aug 14 '24 01:08 statefb