infoq-downloader
Download InfoQ presentations, including video & slides, to play them offline
```
C:\Users\Bob\Downloads>infoq_downloader.exe http://www.infoq.com/presentations/testng-groovy-spock
Downloading HTML file
Traceback (most recent call last):
  File "", line 35, in
IndexError: list index out of range
```
Hi, I'm getting an error when attempting to download a presentation (on Windows): `C:\Users\me\Desktop>infoq_downloader.exe https://www.infoq.com/presentations/panel-business-data-science` `Downloading HTML file Traceback (most recent call last): File "", line 32, in ` `File "F:\tayseer\projects\infoq-downloader-player\build\infoq_downloader\out00-PYZ.pyz\requests.api",...
C:\Users\Dimitry\Downloads>infoq_downloader.exe http://www.infoq.com/presentations/responsive-design Downloading HTML file Traceback (most recent call last): File "", line 35, in IndexError: list index out of range
Here is the error: `requests.exceptions.MissingSchema: Invalid URL '//d1snlc0orfrhj.cloudfront.net/presentations-br/qconsp2015-LucianoRamalho-Python.mp4': No schema supplied. Perhaps you meant http:////d1snlc0orfrhj.cloudfront.net/presentations-br/qconsp2015-LucianoRamalho-Python.mp4?` This worked like a charm: `video_url = 'http:%s' % video_url`
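The workaround above prepends a scheme unconditionally. A slightly more defensive sketch only touches protocol-relative URLs (those starting with `//`) and leaves complete URLs alone. The helper name `ensure_scheme` and its `default_scheme` parameter are hypothetical and not part of infoq_downloader.

```python
def ensure_scheme(url, default_scheme='http'):
    """Return url with a scheme, assuming '//host/path' means 'scheme-relative'.

    Hypothetical helper: URLs that already carry a scheme are returned unchanged.
    """
    if url.startswith('//'):
        # '//host/path' -> 'http://host/path'
        return '{}:{}'.format(default_scheme, url)
    return url


video_url = ensure_scheme(
    '//d1snlc0orfrhj.cloudfront.net/presentations-br/qconsp2015-LucianoRamalho-Python.mp4'
)
```

Checking for the leading `//` first means a URL that already carries a scheme is returned unchanged, so the helper is safe to call on every extracted URL.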
Bug fixes: fix video URL capture by removing the double slashes from the URI (e.g. http://// ); fix the presentation directory name when it contains Unicode characters.
my workaround suggestion
try downloading http://www.infoq.com/presentations/aeron-messaging
$ python infoq_downloader.py http://www.infoq.com/presentations/db-history-data-processing\?utm_source\=infoq\&utm_medium\=videos_homepage\&utm_campaign\=videos_row2 Downloading HTML file Traceback (most recent call last): File "infoq_downloader.py", line 80, in if not os.path.exists('{}/slides'.format(presentation_directory)): UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 31:...
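The traceback above looks like the usual Python 2 pattern of a byte-string template (`'{}/slides'`) being formatted with a Unicode directory name, which forces an implicit ASCII encode. A minimal sketch, assuming that diagnosis is right: build the path from Unicode pieces with `os.path.join`. The function name `slides_directory` is hypothetical.

```python
# -*- coding: utf-8 -*-
import os


def slides_directory(presentation_directory):
    """Build the slides path without an implicit ASCII encode.

    Assumes presentation_directory is a text (unicode) string, e.g. a page title.
    """
    # u'slides' keeps the whole operation in Unicode on Python 2;
    # on Python 3 the u prefix is accepted and has no effect.
    return os.path.join(presentation_directory, u'slides')


path = slides_directory(u'It\u2019s About Time')
```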
The download of slides stopped working a little while ago. It seems the reason was a change in the URL or InfoQ's CDN (?), which caused the regex that extracts...
Per issue #17, this change removes characters that are invalid in Microsoft Windows folder names, which include the characters \/:*?"|.
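The kind of clean-up described above can be sketched as follows. This is not the code from the change itself: the name `sanitize_folder_name` is hypothetical, and the character set adds `<` and `>` (also reserved on Windows) to the characters listed in the description.

```python
import re

# Characters from the description above, plus '<' and '>' (added here as an
# assumption; Windows reserves them in file and folder names as well).
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')


def sanitize_folder_name(name):
    """Drop characters that Windows rejects in folder names (hypothetical helper)."""
    return _INVALID_CHARS.sub('', name)


folder = sanitize_folder_name('What is "REST"? A/B: test*')
```

Removing the characters rather than replacing them keeps the title readable, at the cost that two titles differing only in punctuation would map to the same folder.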