crawlee-python

SSL certificate error

Open. Pijukatel opened this issue 7 months ago • 2 comments

Crawlers are failing with httpx.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007). Investigate and fix. A dependency update probably caused this.

[BeautifulSoupCrawler] ERROR Request failed and reached maximum retries
      Traceback (most recent call last):
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
          yield
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 394, in handle_async_request
          resp = await self._pool.handle_async_request(req)
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 256, in handle_async_request
          raise exc from None
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 236, in handle_async_request
          response = await connection.handle_async_request(
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpcore/_async/connection.py", line 101, in handle_async_request
          raise exc
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpcore/_async/connection.py", line 78, in handle_async_request
          stream = await self._connect(request)
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpcore/_async/connection.py", line 156, in _connect
          stream = await stream.start_tls(**kwargs)
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 67, in start_tls
          with map_exceptions(exc_map):
        File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
          self.gen.throw(typ, value, traceback)
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
          raise to_exc(exc) from exc
      httpcore.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)

      The above exception was the direct cause of the following exception:

      Traceback (most recent call last):
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/crawlee/crawlers/_basic/_context_pipeline.py", line 68, in __call__
          result = await middleware_instance.__anext__()
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/crawlee/crawlers/_abstract_http/_abstract_http_crawler.py", line 255, in _make_http_request
          result = await self._http_client.crawl(
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/crawlee/http_clients/_httpx.py", line 159, in crawl
          response = await client.send(http_request)
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1629, in send
          response = await self._send_handling_auth(
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1657, in _send_handling_auth
          response = await self._send_handling_redirects(
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1694, in _send_handling_redirects
          response = await self._send_single_request(request)
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1730, in _send_single_request
          response = await transport.handle_async_request(request)
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/crawlee/http_clients/_httpx.py", line 60, in handle_async_request
          response = await super().handle_async_request(request)
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 393, in handle_async_request
          with map_httpcore_exceptions():
        File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
          self.gen.throw(typ, value, traceback)
        File "/.../beautifulsoup_crawler_py/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 118, in map_httpcore_exceptions
          raise mapped_exc(message) from exc
      httpx.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)
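
For anyone trying to reproduce this, a minimal diagnostic sketch follows. It only reports which OpenSSL build, default CA paths, and certifi version the environment sees, which helps narrow down an "unable to get local issuer certificate" failure. The function name `describe_ca_setup` is hypothetical, and the sketch assumes (as far as I know) that httpx builds its default trust store from the certifi bundle, so the certifi version is the interesting field here.

```python
from __future__ import annotations

import ssl
from importlib import metadata


def describe_ca_setup() -> dict[str, str | None]:
    """Collect basic facts about the TLS trust configuration (hypothetical helper)."""
    paths = ssl.get_default_verify_paths()
    try:
        # Installed certifi version, if the package is present at all.
        certifi_version = metadata.version("certifi")
    except metadata.PackageNotFoundError:
        certifi_version = None
    return {
        "openssl_version": ssl.OPENSSL_VERSION,
        "default_cafile": paths.cafile,
        "default_capath": paths.capath,
        "certifi_version": certifi_version,
    }


if __name__ == "__main__":
    for key, value in describe_ca_setup().items():
        print(f"{key}: {value}")
```

Running it inside the crawler's virtual environment and comparing the output with a working environment should show whether the certifi version differs.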

Pijukatel, May 01 '25 07:05

Caused by the bump of certifi (python-certifi) from 2025.1.31 to 2025.4.26; see https://github.com/certifi/python-certifi/issues/349. Let's pin certifi to 2025.1.31 for now and wait for the upstream issue to be fully resolved.
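
A quick way to check whether an environment picked up the suspect release is sketched below. The helper names (`parse_version`, `is_suspect`, `installed_certifi_version`) are hypothetical and not part of crawlee or certifi. The sketch assumes certifi versions are purely numeric, dot-separated dates, and it flags only the exact version named above, since whether later releases are affected is not established here.

```python
from __future__ import annotations

from importlib import metadata

SUSPECT_VERSION = "2025.4.26"    # release named in this issue
KNOWN_GOOD_VERSION = "2025.1.31"  # release proposed for the temporary pin


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2025.04.26' or '2025.4.26' into (2025, 4, 26)."""
    return tuple(int(part) for part in version.split("."))


def is_suspect(version: str) -> bool:
    """True only for the exact certifi release linked to the SSL failures."""
    return parse_version(version) == parse_version(SUSPECT_VERSION)


def installed_certifi_version() -> str | None:
    """Return the installed certifi version, or None if it is not installed."""
    try:
        return metadata.version("certifi")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_certifi_version()
    if current is None:
        print("certifi is not installed")
    elif is_suspect(current):
        print(f"certifi {current} is the suspect release; consider {KNOWN_GOOD_VERSION}")
    else:
        print(f"certifi {current} is not the suspect release")
```

In a requirements file, the temporary pin would be written as `certifi==2025.1.31`.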

Pijukatel, May 01 '25 07:05

Close this once the upstream issue is resolved and certifi can be removed from the dependencies again.

Pijukatel, May 02 '25 09:05