newspaper
There seem to be complaints related to the user agent scraping permission issue
Hello,
I think quite a lot of people have created issues similar to this one. I solved my problem with the user agent trick (I was not allowed to scrape the contents of a website, for whatever reason, and the result of `article.html` was basically an empty string).
Either way, I found out that the solution is to pass a `Config` object as a parameter to the `Article` class, with `browser_user_agent` set to something like `Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:108.0) Gecko/20100101 Firefox/108.0`. I'm wondering whether this detail should be added to the main README.md file. I'm convinced it would be helpful and would save other people a lot of time.
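For anyone landing here with the same problem, this is a minimal sketch of the workaround described above. The helper name `fetch_article_html` and the example URL are made up for illustration, and whether a given site accepts this user agent is an assumption that may not hold everywhere.

```python
# User agent string from the description above; any recent browser string may work.
USER_AGENT = (
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:108.0) "
    "Gecko/20100101 Firefox/108.0"
)


def fetch_article_html(url, user_agent=USER_AGENT):
    """Download a page with a browser-like user agent and return its raw HTML.

    Hypothetical helper, not part of newspaper itself.
    """
    # Imported inside the function so this snippet can be loaded
    # even where newspaper is not installed.
    from newspaper import Article, Config

    config = Config()
    config.browser_user_agent = user_agent  # replaces the library's default user agent

    article = Article(url, config=config)
    article.download()
    return article.html


# Example call (placeholder URL, requires network access):
# html = fetch_article_html("https://example.com/some-article")
```

If `article.html` is still empty after this, the site may be blocking on something other than the user agent, so this is not guaranteed to fix every case.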
Thank you.