newspaper does not fetch Arabic news

Hello, I tried newspaper, but it did not fetch Arabic news from sites such as https://www.alarabiya.net/. I got zero articles.

My code:

import newspaper  # the newspaper3k package is imported under the name "newspaper"
news_paper = newspaper.build('https://www.alarabiya.net/', language='ar', memoize_articles=False)

Jan 16 '21 12:01 moh55m55

Newspaper can obtain article information from the target website, but it requires additional code to bypass the "accept all cookies" prompt, which has to be clicked first. Take a look at the examples in my newspaper3k usage overview document.

Jan 16 '21 23:01 johnbumgarner

I reviewed the examples but could not figure out how to bypass the cookie prompt. I appreciate your help.

Jan 17 '21 18:01 ghost

The overview talks about using Selenium to bypass the "accept all cookies" prompt on websites that require you to click it before accessing content. I will look into writing an example for https://www.alarabiya.net, but it will take a couple of days before I can get to it and update the overview document.

Jan 17 '21 20:01 johnbumgarner

Sounds great. I appreciate it.

Jan 17 '21 20:01 ghost

I added a scraping example for the Al Arabiya website to my Newspaper overview document. Please note that I didn't build an entire solution for you. All the information needed to finish the code is in my overview document, and you can combine it with the other code yourself. Additionally, you will need to determine which URLs are important to you, because I don't read Arabic, so it's hard for me to pick the correct items. Good luck.
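
For readers following along, the Selenium approach discussed in this thread might look roughly like the sketch below. It is not the code from the overview document, only a minimal illustration under some assumptions: `CONSENT_XPATH` is a hypothetical locator (inspect the live page to find the real consent button), the helper names `fetch_html_after_consent`, `parse_article`, and `same_site` are made up for this example, and a local Chrome install is assumed.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical locator: the real consent button on the site may differ.
CONSENT_XPATH = "//button[contains(., 'Accept')]"


def fetch_html_after_consent(url, consent_xpath=CONSENT_XPATH, timeout=10):
    """Load url in headless Chrome, click the consent button if one
    appears within timeout seconds, and return the page HTML."""
    # Imported here so the pure helper below works without Selenium.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        try:
            button = WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.XPATH, consent_xpath))
            )
            button.click()
        except TimeoutException:
            pass  # no prompt showed up; continue with the page as loaded
        return driver.page_source
    finally:
        driver.quit()


def parse_article(url, html, language="ar"):
    """Hand already-fetched HTML to newspaper instead of letting it download."""
    from newspaper import Article

    article = Article(url, language=language)
    article.download(input_html=html)  # skips the network request
    article.parse()
    return article


def same_site(link, base_url):
    """Return the absolute form of link if it stays on base_url's host,
    otherwise None. A first-pass filter only; deciding which URLs are
    actual articles is still up to you."""
    absolute = urljoin(base_url, link)
    if urlparse(absolute).netloc == urlparse(base_url).netloc:
        return absolute
    return None
```

A typical flow would be to call `fetch_html_after_consent('https://www.alarabiya.net/')`, collect the links you care about from the returned HTML, keep the ones `same_site` accepts, and then fetch and pass each one to `parse_article`. The page may still be loading content right after the click, so an extra wait could be needed in practice.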

P.S. Don't forget to close this issue, because it has been solved.

Jan 21 '21 22:01 johnbumgarner


@moh55m55 have you tested the code that I posted on 01-21-2021?

Apr 15 '21 20:04 johnbumgarner
