requestium
requestium Response.xpath doesn't respect the requests.Response encoding setting
    from requestium import Session

    s = Session(
        webdriver_path='./chromedriver',
        browser='chrome',
        default_timeout=15,
        webdriver_options={'arguments': ['headless']},
    )
    a = s.get('https://www.baidu.com/')
    print(a.encoding)

    # After setting the encoding to utf-8, a.text is OK,
    # but a.xpath still returns unreadable characters.
    a.encoding = 'utf-8'

    print(a.text)  # readable here
    print(
        a.xpath('//text()[normalize-space() and not(ancestor::script | ancestor::style)]')
        .extract()
    )  # unrecognizable characters here
Hello @tinylambda, thanks for the issue report! We'll improve the integration between the response's encoding and the encoding the XPath parser uses, to fix these sorts of problems.
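For readers following along, here is a minimal standard-library sketch of the kind of mismatch described in this issue. It is not requestium's code and not the actual fix. It assumes the body is UTF-8 and that the first encoding guess was ISO-8859-1 (the fallback requests uses for text responses with no declared charset). `RAW_BODY`, `TextCollector`, and `extract_text` are hypothetical names made up for illustration, and `html.parser` stands in for the real XPath parser.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the response body: UTF-8 bytes as sent by the server.
RAW_BODY = (
    "<html><head><style>p{}</style></head>"
    "<body><p>百度一下</p></body></html>"
).encode("utf-8")


class TextCollector(HTMLParser):
    """Collect non-blank text that is not inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(content, encoding):
    """Decode the raw bytes with `encoding`, then hand the text to the parser."""
    parser = TextCollector()
    parser.feed(content.decode(encoding, errors="replace"))
    parser.close()
    return parser.chunks


# Parsing with the stale guess yields mojibake; parsing with the encoding
# the user set on the response yields readable text.
wrong = extract_text(RAW_BODY, "ISO-8859-1")
right = extract_text(RAW_BODY, "utf-8")

print(wrong)
print(right)
```

The point of the sketch is only that the text handed to the parser should be decoded with whatever encoding is currently set on the response, rather than with a value captured before the user changed it.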
Fixed by https://github.com/tryolabs/requestium/commit/96bf7e5f457f7939ca0fc4a05669c6c3f2d5d539