
update web_base.py to have verify option

Open jackfrost1411 opened this issue 2 years ago • 2 comments

This change adds a "verify" option to the initializer of the web-based loader in web_base.py. Some web pages fail to load with SSL verification errors; setting the verify parameter to False lets users skip SSL verification for those pages, giving them more flexibility and control.
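As a rough sketch of the idea (not the actual web_base.py code), the option could be carried from the initializer through to the HTTP call as shown here. SketchWebLoader and request_kwargs are hypothetical names used only for illustration; the one assumption about a real library is that requests.get accepts headers and verify keyword arguments.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class SketchWebLoader:
    """Hypothetical stand-in for a web loader; not LangChain's actual class."""

    web_path: str
    header_template: Optional[Dict[str, str]] = None
    verify: bool = True  # assumed default: keep SSL verification on

    def request_kwargs(self) -> Dict[str, Any]:
        """Build keyword arguments meant for requests.get(self.web_path, **kwargs)."""
        return {
            # Copy the headers so the caller's dict is never mutated.
            "headers": dict(self.header_template or {}),
            # Forward the flag unchanged; False skips certificate checks.
            "verify": self.verify,
        }
```

Calling SketchWebLoader("https://SO_AND_SO.com", verify=False).request_kwargs() gives a dict with verify set to False, which a loader could then pass along to its HTTP request.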

Who can review?

@eyurtsev

jackfrost1411, Jun 13 '23 19:06

With the verify option added, you can pass verify=False to skip SSL verification while still sending custom headers, such as a browser User-Agent:

from langchain.document_loaders import WebBaseLoader

# Browser-like User-Agent; verify=False is the option proposed in this change
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
loader = WebBaseLoader(web_path="https://SO_AND_SO.com", header_template=headers, verify=False)
data = loader.load()

This solves a lot of issues that I faced in the recent past.

jackfrost1411, Jun 13 '23 20:06

The older version of web_base.py gives errors: [screenshot]

The newer version of web_base.py works fine: [screenshot: Screen Shot 2023-06-13 at 1 27 12 PM]

jackfrost1411, Jun 13 '23 20:06