python-goose
HtmlFetcher does not handle gzip compression
Some servers force gzip compression on their content, which HtmlFetcher does not handle gracefully because urllib2 does not decompress responses itself. The cheapest and easiest solution would be to check the Content-Encoding header on the response and decompress with zlib if it's gzipped. A more ambitious, heavier solution would be to move over to something like requests rather than urllib2.
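
A minimal sketch of the cheap fix described above, assuming the fetcher already has the raw body bytes and the value of the Content-Encoding header in hand. The helper name `decode_body` is hypothetical and not part of python-goose; it only illustrates the header check and the zlib call.

```python
import zlib


def decode_body(raw, content_encoding):
    """Return the body bytes, decompressed when the header says they are compressed.

    `decode_body` is a hypothetical helper for illustration, not goose API.
    """
    encoding = (content_encoding or "").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        return zlib.decompress(raw, 16 + zlib.MAX_WBITS)
    if encoding == "deflate":
        try:
            # Standard case: zlib-wrapped deflate stream.
            return zlib.decompress(raw)
        except zlib.error:
            # Some servers send a raw deflate stream with no zlib header.
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    # No (or unknown) encoding: hand the bytes back untouched.
    return raw
```

With urllib2, the two inputs would presumably come from `response.read()` and `response.info().get('Content-Encoding')`; where exactly this belongs inside HtmlFetcher is left open here.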
Requests: 72929331d44309f9002ae0dd3cd268cfddb0e43f
Awesome! Let's hope it gets merged.