grab/pycurl fails to parse cookies
Content of `grab.response.head` at the moment the error occurred:
b'HTTP/1.1 200 OK\r\nDate: Tue, 13 Jun 2017 22:16:36 GMT\r\nServer: Apache\r\nSet-Cookie: \xb3\xd2\xda\xcd\xd7=%96%A6g%9Ay%B0%A5g%A7tm%7C%95%9A; expires=Tue, 25-Jul-2017 14:16:36 GMT; path=/\r\nX-Powered-By: Apache2\r\nVary: Accept-Encoding\r\nContent-Encoding: gzip\r\nContent-Length: 4974\r\nContent-Type: text/html\r\n\r\n'
Error log:
Traceback (most recent call last):
  File "/home/web/web/netrank/.env/src/grab/grab/transport/curl.py", line 517, in prepare_response
    response.cookies = CookieManager(self.extract_cookiejar())
  File "/home/web/web/netrank/.env/src/grab/grab/transport/curl.py", line 555, in extract_cookiejar
    for line in self.curl.getinfo(pycurl.INFO_COOKIELIST):
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb3 in position 43: invalid start byte
I've reported the bug to the pycurl project: https://github.com/pycurl/pycurl/issues/493
Possible solution: fix invalid cookies (those containing non-ASCII bytes) in the `header_processor` function in `grab.transport.curl`.
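A rough sketch of what "fixing" such a cookie could look like: percent-encode every non-ASCII byte in a `Set-Cookie` header line so the result is plain ASCII and can be decoded safely. The helper names `escape_non_ascii` and `sanitize_set_cookie` are hypothetical and not part of grab or pycurl. How and where this would be wired into `grab.transport.curl` is left open, and whether it actually avoids the error in `getinfo(pycurl.INFO_COOKIELIST)` is untested.

```python
def escape_non_ascii(data: bytes) -> bytes:
    """Replace every byte >= 0x80 with its %XX escape; keep ASCII bytes as-is."""
    return b"".join(
        b"%%%02X" % byte if byte >= 0x80 else bytes([byte])
        for byte in data
    )


def sanitize_set_cookie(header_line: bytes) -> bytes:
    """Hypothetical helper: make a Set-Cookie header line pure ASCII.

    Any other header line is returned unchanged.
    """
    name, sep, value = header_line.partition(b":")
    if not sep or name.strip().lower() != b"set-cookie":
        return header_line
    return name + sep + escape_non_ascii(value)


# The offending header from the response above.
raw = (
    b"Set-Cookie: \xb3\xd2\xda\xcd\xd7=%96%A6g%9Ay%B0%A5g%A7tm%7C%95%9A; "
    b"expires=Tue, 25-Jul-2017 14:16:36 GMT; path=/\r\n"
)
fixed = sanitize_set_cookie(raw)
print(fixed.decode("ascii"))
```

Note that this changes the cookie name as seen by the client, so a server that expects the original raw bytes back may not recognize the cookie. Dropping the invalid cookie entirely would be the other obvious option.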