WKZombie
URL unable to parse/Bypass robots.txt
I have an HTTPS URL that WKZombie isn't able to parse. With other tools I've needed to bypass robots.txt, but there doesn't seem to be any setting for this in WKZombie?
Hi @hugolundin. No, currently there's no such setting. What are you trying to accomplish? Maybe changing the user agent or adjusting the HTTP headers might help?
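For reference, WKZombie doesn't expose a user-agent setting of its own, but it drives a WKWebView internally, so the standard WebKit hook is available if you patch or fork the library. A minimal sketch (this is plain WebKit API, not WKZombie's):

```swift
import WebKit

// Minimal sketch, assuming you can reach the WKWebView that WKZombie
// uses internally (e.g. in a fork) -- WKZombie itself does not expose it.
let webView = WKWebView(frame: .zero, configuration: WKWebViewConfiguration())

// customUserAgent is available since iOS 9 / macOS 10.11 and overrides
// the default user agent for every request this web view makes.
webView.customUserAgent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12) "
    + "AppleWebKit/603.3.8 (KHTML, like Gecko) Version/10.1.2 Safari/603.3.8"
```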
I am trying to parse a website for some URLs. It has worked fine using Selenium with PhantomJS, and also with Mechanize in Python, but when I try it with WKZombie, the website loads until it logs "Unable to parse". The reason I suspected robots.txt was that Mechanize complained about it until I enabled its setting to bypass it.
Do you have any suggestions for common ways to change the user agent and/or the HTTP headers? Thank you very much for your reply!
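One quick way to check whether the server is gating on the user agent at all, independent of WKZombie, is to fetch the page once with URLSession and a browser-like User-Agent header. If this returns the expected HTML while WKZombie fails, the user agent (not robots.txt) is the likely culprit. A minimal sketch, with "https://example.com" as a placeholder for the actual URL:

```swift
import Foundation

// Fetch the page with a browser-like User-Agent and print the response body.
var request = URLRequest(url: URL(string: "https://example.com")!)
request.setValue(
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12) AppleWebKit/603.3.8 "
        + "(KHTML, like Gecko) Version/10.1.2 Safari/603.3.8",
    forHTTPHeaderField: "User-Agent"
)

let task = URLSession.shared.dataTask(with: request) { data, response, error in
    if let data = data, let html = String(data: data, encoding: .utf8) {
        print(html)  // inspect whether the real page came back
    } else if let error = error {
        print("Request failed: \(error)")
    }
}
task.resume()
```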