Youtube.py: urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1129)>

Open AaronWard opened this issue 1 year ago • 2 comments

The Problem

The YoutubeLoader breaks when using the from_youtube_url function with add_video_info=True. The expected behaviour is that this module fetches transcripts from YouTube videos so they can be passed into an LLM. Willing to help if needed.

Specs

- Machine: Apple M1 Pro
- Version: langchain  0.0.147
- conda-build version : 3.21.8
- python version : 3.9.12.final.0

Code

from dotenv import find_dotenv, load_dotenv
from langchain.document_loaders import YoutubeLoader

load_dotenv(find_dotenv())

loader = YoutubeLoader.from_youtube_url("https://www.youtube.com/watch?v=QsYGlZkevEg", add_video_info=True)
result = loader.load()
print(result)

Output

Traceback (most recent call last):
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/site-packages/pytube/__main__.py", line 341, in title
    self._title = self.vid_info['videoDetails']['title']
KeyError: 'videoDetails'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/urllib/request.py", line 1346, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/http/client.py", line 1285, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/http/client.py", line 1331, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/http/client.py", line 1280, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/http/client.py", line 1040, in _send_output
    self.send(msg)
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/http/client.py", line 980, in send
    self.connect()
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/http/client.py", line 1454, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1129)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/<username>/Desktop/personal/github/ar-assistant/notebooks/research/langchain/scripts/5-indexes.py", line 28, in <module>
    result = loader.load()
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/site-packages/langchain/document_loaders/youtube.py", line 133, in load
    video_info = self._get_video_info()
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/site-packages/langchain/document_loaders/youtube.py", line 174, in _get_video_info
    "title": yt.title,
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/site-packages/pytube/__main__.py", line 345, in title
    self.check_availability()
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/site-packages/pytube/__main__.py", line 210, in check_availability
    status, messages = extract.playability_status(self.watch_html)
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/site-packages/pytube/__main__.py", line 102, in watch_html
    self._watch_html = request.get(url=self.watch_url)
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/site-packages/pytube/request.py", line 53, in get
    response = _execute_request(url, headers=extra_headers, timeout=timeout)
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/site-packages/pytube/request.py", line 37, in _execute_request
    return urlopen(request, timeout=timeout)  # nosec
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/urllib/request.py", line 517, in open
    response = self._open(req, data)
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/urllib/request.py", line 534, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/urllib/request.py", line 1389, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Users/<username>/opt/anaconda3/envs/dev/lib/python3.9/urllib/request.py", line 1349, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1129)>

FYI

  • Code excerpts are duplicated on the YouTube page of the LangChain docs

AaronWard, Apr 23 '23 13:04

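The issue appears to come from langchain.document_loaders.youtube._get_video_info, which is only called when the add_video_info argument is set to True. That is the code path that goes through pytube and urllib in the traceback above.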
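To narrow this down, a minimal sketch along these lines might help: it tries the metadata path first and retries without add_video_info if urllib raises URLError. The helper names make_youtube_loader and load_with_fallback are hypothetical (they are not part of langchain), and whether the transcript request itself succeeds on an affected machine is an assumption this thread does not confirm.

```python
from urllib.error import URLError

VIDEO_URL = "https://www.youtube.com/watch?v=QsYGlZkevEg"  # URL from the report


def make_youtube_loader(add_video_info):
    """Hypothetical factory wrapping the call from the report."""
    # Imported lazily so load_with_fallback can be exercised without langchain.
    from langchain.document_loaders import YoutubeLoader

    return YoutubeLoader.from_youtube_url(VIDEO_URL, add_video_info=add_video_info)


def load_with_fallback(make_loader):
    """Load with video metadata; on URLError, retry without it.

    Returns (documents, metadata_included).
    """
    try:
        return make_loader(True).load(), True
    except URLError as err:
        # This is the exception at the bottom of the traceback above.
        print(f"Metadata lookup failed ({err.reason}); retrying without add_video_info")
        return make_loader(False).load(), False
```

Calling load_with_fallback(make_youtube_loader) would then show whether only the metadata lookup is affected. This works around the symptom and does not fix the certificate problem itself.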

AaronWard, Apr 23 '23 14:04

Hey @AaronWard

Give this a try. It worked for me.

https://stackoverflow.com/questions/68275857/urllib-error-urlerror-urlopen-error-ssl-certificate-verify-failed-certifica

Let me know if it works.
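Before changing anything, it may be worth checking which CA bundle this Python interpreter trusts by default. The snippet below is only a diagnostic sketch using the standard library's ssl.get_default_verify_paths(); the function name describe_ca_config is made up for illustration. "Self signed certificate in certificate chain" means the chain the server presented contains a self-signed certificate that is not in this trust store, which can happen, for example, behind a proxy that re-signs HTTPS traffic. That cause is a guess, not something confirmed here.

```python
import os
import ssl


def describe_ca_config():
    """Summarise where this interpreter looks for trusted CA certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        # Resolved CA file/dir actually in effect (None if nothing usable was found).
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Name of the environment variable that can override the CA file.
        "cafile_env_var": paths.openssl_cafile_env,
        "cafile_env_value": os.environ.get(paths.openssl_cafile_env),
        # Location compiled into the OpenSSL build this Python links against.
        "compiled_in_cafile": paths.openssl_cafile,
    }


for key, value in describe_ca_config().items():
    print(f"{key}: {value}")
```

If cafile comes back as None, or points at a bundle that lacks the certificate your network presents, that would be consistent with the error above. In that case pointing the reported environment variable at a bundle that includes the missing root is one option to try.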

shawnesquivel, May 07 '23 01:05

Hi, @AaronWard! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you reported an issue with the YoutubeLoader in the langchain library encountering an SSL certificate verification error on an Apple M1 Pro machine. It seems that the issue is related to the langchain.document_loaders.youtube._get_video_info function when the add_video_info argument is set to True. AlfredOdling provided a potential solution, and shawnesquivel suggested trying a solution from Stack Overflow.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your contribution to the LangChain project!

dosubot[bot], Sep 17 '23 17:09