edx-dl

Download SSL Error with Proxy in China

Open qu123xxx opened this issue 5 years ago • 3 comments

🚨Please review the Troubleshooting section before reporting any issue. Don't forget also to check the current issues to avoid duplicates.

Subject of the issue

edx-video.net is blocked by the GFW, so videos can't be downloaded from it directly. If I switch my proxy software to "Global Proxy" mode, the download fails with an SSLError (bad request). On top of that, requests.get() never manages to download the complete video.

Your environment

  • Operating System : Win10 64bit
  • Python version: 3.6
  • youtube-dl version: latest
  • edx-dl version: latest

Steps to reproduce

From a network inside China, download any course in the normal way with '-i'. The error will occur.

Solution:

I modified download_url() in edx_dl.py like this:

`def download_url(url, filename, headers, args):
    """
    Downloads the given url in filename.
    """
    if is_youtube_url(url):
        download_youtube_url(url, filename, headers, args)
    else:
        import ssl
        import requests
        # FIXME: Ugly hack for coping with broken SSL sites:
        # https://www.cs.duke.edu/~angl/papers/imc10-cloudcmp.pdf
        #
        # We should really ask the user if they want to stop the downloads
        # or if they are OK proceeding without verification.
        #
        # Note that skipping verification by default could be a problem for
        # people's lives if they happen to live in dictatorial countries.
        #
        # Note: The mess with various exceptions being caught (and their
        # order) is due to different behaviors in different Python versions
        # (e.g., 2.7 vs. 3.4).
        try:
            # mitxpro fix for downloading compressed files
            if 'zip' in url and 'mitxpro' in url:
                urlretrieve(url, filename)
            else:
                # Use a SOCKS proxy; an HTTP proxy results in a "bad request" error.
                proxies = {
                    "http": "socks5://127.0.0.1:1080",
                    "https": "socks5://127.0.0.1:1080",
                }
                pre_content_length = 0
                # Loop because a single requests.get() / file.write() pass
                # can leave the download incomplete.
                while True:
                    if os.path.exists(filename):
                        # Ask the server for the bytes after those already on disk.
                        headers['Range'] = 'bytes=%d-' % os.path.getsize(filename)
                    with requests.get(url, headers=headers, proxies=proxies, stream=True) as r:
                        r.raise_for_status()
                        content_length = int(r.headers['content-length'])
                        print(content_length)
                        if content_length < pre_content_length or (
                                os.path.exists(filename)
                                and os.path.getsize(filename) == content_length
                        ) or content_length == 0:
                            break
                        pre_content_length = content_length
                        with open(filename, 'wb') as fp:
                            # 52428800 bytes = 50 MiB per chunk
                            for chunk in r.iter_content(chunk_size=52428800):
                                if chunk:  # filter out keep-alive new chunks
                                    fp.write(chunk)
                                    print('download success, file size: %d   total size: %d'
                                          % (os.path.getsize(filename), content_length))
                # total_size = int(r.headers['Content-Length'])
        except Exception as e:
            logging.warning('Got SSL/Connection error: %s', e)
            if not args.ignore_errors:
                logging.warning('Hint: if you want to ignore this error, add '
                                '--ignore-errors option to the command line')
                raise e
            else:
                logging.warning('SSL/Connection error ignored: %s', e)`

This code can download videos from edx-video.net successfully.

However, my coding skills are limited, so I couldn't make the proxy configurable from the command line.

I hope somebody can add that.
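One way the proxy could be made configurable is sketched below. This is only an illustration: the `--proxy` flag and the `build_proxies` helper are hypothetical names and are not part of edx-dl. It assumes the option is added to an argparse parser and the resulting dictionary is passed to requests through its `proxies` argument.

```python
import argparse


def parse_args(argv=None):
    """Parse a hypothetical --proxy option (not an existing edx-dl flag)."""
    parser = argparse.ArgumentParser(description='proxy option sketch')
    parser.add_argument('--proxy', default=None, metavar='URL',
                        help='proxy URL used for all downloads, '
                             'e.g. socks5h://127.0.0.1:1080')
    return parser.parse_args(argv)


def build_proxies(proxy_url):
    """Return a requests-style proxies dict, or None when no proxy is set."""
    if not proxy_url:
        return None
    # The same proxy is used for both plain and TLS traffic.
    return {'http': proxy_url, 'https': proxy_url}


if __name__ == '__main__':
    args = parse_args()
    print(build_proxies(args.proxy))
```

The hard-coded dictionary in download_url() could then be replaced by `proxies=build_proxies(args.proxy)`. SOCKS URLs need the optional SOCKS support for requests (`pip install requests[socks]`), and the `socks5h://` scheme makes the proxy resolve hostnames, which may help when local DNS answers are unreliable. requests also reads the `HTTP_PROXY` and `HTTPS_PROXY` environment variables by default, so setting those might work without any code change.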

ref: https://juejin.im/post/5c331483e51d455246489a25 https://stackoverflow.com/questions/23645212/requests-response-iter-content-gets-incomplete-file-1024mb-instead-of-1-5gb

qu123xxx • Jan 08 '20 14:01

I find the code still sometimes can't download the whole video by requests.
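A likely cause is visible in the loop itself. The file is reopened with 'wb' on every pass, so a resumed range overwrites what is already on disk. The `content-length` of a ranged response is also only the remaining byte count, so the second pass always looks "smaller" and the loop exits. Below is a minimal sketch of a resumable download, not a tested patch for edx-dl. The names `download_with_resume`, `requests_fetch` and `parse_total` are made up for illustration, and it assumes the server answers Range requests with 206 and a `Content-Range` header.

```python
import os


def parse_total(content_range):
    """Extract the full size from e.g. 'bytes 100-999/1000' (None if unknown)."""
    total = content_range.rpartition('/')[2].strip()
    return int(total) if total.isdigit() else None


def download_with_resume(fetch, filename, max_attempts=10):
    """Download to `filename`, resuming after truncated transfers.

    `fetch(offset)` returns `(start, total, chunks)`: the offset the server
    actually started from, the full file size (None if unknown), and an
    iterable of byte chunks.
    """
    for _ in range(max_attempts):
        offset = os.path.getsize(filename) if os.path.exists(filename) else 0
        try:
            start, total, chunks = fetch(offset)
            # Append only if the server honoured the Range request;
            # otherwise start over so the file is not corrupted.
            mode = 'ab' if offset > 0 and start == offset else 'wb'
            with open(filename, mode) as fp:
                for chunk in chunks:
                    fp.write(chunk)
        except OSError:
            # requests' exceptions derive from IOError/OSError: try again.
            continue
        # With an unknown total, a transfer that ended cleanly is accepted.
        if total is None or os.path.getsize(filename) >= total:
            return True
    return False


def requests_fetch(url, headers=None, proxies=None, chunk_size=1024 * 1024):
    """Build a `fetch` callable backed by requests (third-party package)."""
    import requests  # imported lazily; only this helper depends on it

    def fetch(offset):
        request_headers = dict(headers or {})
        if offset:
            request_headers['Range'] = 'bytes=%d-' % offset
        r = requests.get(url, headers=request_headers, proxies=proxies,
                         stream=True, timeout=30)
        if r.status_code == 416:
            # The range starts at or past the end: nothing left to fetch.
            r.close()
            return offset, offset, []
        r.raise_for_status()
        if r.status_code == 206:
            total = parse_total(r.headers.get('Content-Range', ''))
            start = offset
        else:
            # Range was ignored (plain 200): the body starts at byte 0.
            length = r.headers.get('Content-Length')
            total = int(length) if length else None
            start = 0
        return start, total, r.iter_content(chunk_size=chunk_size)

    return fetch
```

Usage would look like `download_with_resume(requests_fetch(url, headers, proxies), filename)`. Keeping the transport behind the `fetch` callable lets the retry logic be exercised without a network connection.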

qu123xxx • Jan 10 '20 14:01

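Use https://ping.chinaz.com/ to look up the real IP address of the domain (pick the one with reasonable latency). After you add it to your hosts file, the certificate error should no longer occur.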
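As a rough illustration of the hosts approach, the sketch below formats a hosts entry and checks which address a name currently resolves to. The IP `203.0.113.10` is a placeholder from the documentation address range, not the real address of edx-video.net, and `hosts_line` and `resolves_to` are hypothetical helper names.

```python
import socket


def hosts_line(ip, hostname):
    """Format one hosts-file entry: address, a tab, then the hostname."""
    return '%s\t%s' % (ip, hostname)


def resolves_to(hostname, expected_ip, port=443):
    """Return True if `hostname` currently resolves to `expected_ip`."""
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return False
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the resolved address.
    return expected_ip in {info[4][0] for info in infos}


if __name__ == '__main__':
    # Placeholder address: substitute the one found with the lookup tool.
    print(hosts_line('203.0.113.10', 'edx-video.net'))
    print(resolves_to('edx-video.net', '203.0.113.10'))
```

After the entry is added to the hosts file (on Windows, typically `C:\Windows\System32\drivers\etc\hosts`), `resolves_to` should return True for the chosen address.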

Oshibuki • Feb 26 '20 07:02

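Thanks for your reply. I changed the code a bit and, with a proxy running as well, managed to download the videos in full.

> Use https://ping.chinaz.com/ to look up the real IP address of the domain (pick the one with reasonable latency). After you add it to your hosts file, the certificate error should no longer occur.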

qu123xxx • Feb 27 '20 01:02