newspaper
Does not fetch Arabic news
Hello,
I tried it, but it did not fetch Arabic news from sites such as https://www.alarabiya.net/
I got zero articles.
My code:
import newspaper  # the newspaper3k package is imported under the name "newspaper"
news_paper = newspaper.build('https://www.alarabiya.net/', language='ar', memoize_articles=False)
Newspaper can obtain article information from the target website, but it needs additional code to get past the "accept all cookies" prompt, which has to be clicked first. Take a look at the examples in my Newspaper3k usage overview document.
I reviewed the examples but could not figure out how to bypass the cookies prompt. I appreciate your help.
The overview talks about using Selenium to bypass the "accept all cookies" prompt on websites that require you to click it before accessing content. I will look into writing an example for https://www.alarabiya.net, but it will take a couple of days before I can get to it and update the overview document.
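As a rough illustration of the Selenium approach described above (this is not the example from the overview document), a minimal sketch might look like the following. It assumes Chrome with a matching driver is installed, and that the consent prompt is a `<button>` whose visible text contains one of the given labels. The function names and the default labels are hypothetical; the real button on https://www.alarabiya.net/ may need a different locator.

```python
def consent_xpath(labels):
    """Build an XPath matching any <button> whose text contains one of the labels.

    Labels must not contain single quotes, since they are embedded in the XPath.
    """
    conditions = " or ".join(
        f"contains(normalize-space(.), '{label}')" for label in labels
    )
    return f"//button[{conditions}]"


def fetch_html_after_consent(url, labels=("Accept", "Agree"), timeout=10):
    """Open the page in Chrome, click the consent button if it shows up,
    and return the rendered HTML."""
    # Imported lazily so consent_xpath() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        try:
            button = WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.XPATH, consent_xpath(labels)))
            )
            button.click()
        except TimeoutException:
            # No matching prompt appeared within the timeout; continue anyway.
            pass
        return driver.page_source
    finally:
        driver.quit()


def parse_article(url, html, language="ar"):
    """Hand already-fetched HTML to Newspaper instead of letting it download."""
    from newspaper import Article

    article = Article(url, language=language)
    article.download(input_html=html)
    article.parse()
    return article


# Possible usage (requires Selenium, Chrome, and newspaper3k):
# html = fetch_html_after_consent("https://www.alarabiya.net/")
# article = parse_article("https://www.alarabiya.net/", html)
# print(article.title)
```

The idea is that Selenium handles the click and rendering, and Newspaper only parses the HTML it is given via `download(input_html=...)`.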
Sounds great. I appreciate it.
I added a scraping example for the Al Arabiya website to my Newspaper overview document. Please note that I didn't build an entire solution for you. All the information needed to finish the code is in the overview document, and you can add it to the example yourself. Additionally, you will need to determine which URLs are important to you. I don't read Arabic, so it's hard for me to pick the correct items. Good luck.
P.S. Don't forget to close this issue, because it has been solved.
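For the step of deciding which URLs matter, one possible starting point is a small filter over the collected links. This is only a sketch: the function name `filter_article_urls` and the path fragments in the usage note are hypothetical, and the actual section paths on the site need to be checked by someone who reads Arabic.

```python
from urllib.parse import urlparse


def filter_article_urls(urls, domain, required_path_parts=()):
    """Keep URLs on the given domain (or its subdomains), optionally only those
    whose path contains one of the required fragments, dropping duplicates."""
    kept = []
    seen = set()
    for url in urls:
        parsed = urlparse(url)
        host = parsed.netloc.lower()
        # Skip links that point to other sites.
        if host != domain and not host.endswith("." + domain):
            continue
        # Skip links outside the sections of interest, if any were given.
        if required_path_parts and not any(
            part in parsed.path for part in required_path_parts
        ):
            continue
        # Preserve the original order while removing repeats.
        if url in seen:
            continue
        seen.add(url)
        kept.append(url)
    return kept
```

With a source built by `newspaper.build(...)`, the input could be `[a.url for a in news_paper.articles]`, and a call might look like `filter_article_urls(urls, "alarabiya.net", ("/section-a/",))`, where `/section-a/` stands in for a real section path.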
Sounds great. I appreciate it.
@moh55m55, have you tested the code that I posted on 01-21-2021?