newspaper issues

Results 152 newspaper issues


Project status

Is this project still maintained? I see a lot of Pull Requests and the last commit to the code was Sep 2, 2020

Date extraction is faulty

The publication date of [this article](https://bmcwomenshealth.biomedcentral.com/articles/10.1186/s12905-022-02136-8) is reported as 2023-12-10. This is impossible, as the article was downloaded on 2023-03-10. The article lists its own publication date as 2023-01-02. Further...

inspectorG4dget
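
A caller can guard against a report like this with a plausibility check: a publication date later than the download date cannot be right. The sketch below is only an illustration. `plausible_publish_date` is a hypothetical helper, not part of newspaper, and the dates are the ones quoted in the issue.

```python
from datetime import date
from typing import Optional


def plausible_publish_date(publish_date: Optional[date],
                           downloaded_on: date) -> Optional[date]:
    """Return publish_date unless it is missing or lies after the download date."""
    if publish_date is None or publish_date > downloaded_on:
        return None  # treat an impossible date as "unknown" rather than trusting it
    return publish_date


downloaded_on = date(2023, 3, 10)
# Date the extractor reported, according to the issue.
reported = plausible_publish_date(date(2023, 12, 10), downloaded_on)
# Date the article itself lists, according to the issue.
stated = plausible_publish_date(date(2023, 1, 2), downloaded_on)
```

A check like this does not fix the extraction itself; it only keeps an obviously wrong value from propagating.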

Allow newspaper to work on news websites like carbon-pulse

Sorry for the unrelated commits. I only found out how to create a PR, not how to limit it to a specific commit or set of commits :(.

Cabu

Getting Older News Articles

Hello, seven years ago this was posted: https://github.com/codelucas/newspaper/issues/245 I have a problem that requires me to scrape a large corpus of titles from 2013-2019 from various news sources. Ideally I...

PaulKMandal

Consider switching from lxml's clean_html for enhanced security (and possibly performance)

I'd like to bring to your attention that we are [discussing](https://bugs.launchpad.net/lxml/+bug/1958539) the possibility of removing the clean_html functionality from the lxml library. Over the past years, there have been several concerning...

frenzymadness

TIPS FOR FAST IMPROVEMENT

I have extracted some meta tags. You can try to identify the title, text, description and date by substituting the provided tags into: `meta[property='{}']`, `meta[name='{}']`, `meta[itemprop='{}']`. Meta tags for publication and...

aleksandar-devedzic
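
The template idea in this tip can be sketched with the standard library alone. The three selector templates come from the issue. The tag names used in the sample (`og:title`, `description`, `datePublished`, `article:published_time`) are assumptions chosen for illustration, since the issue's own list is cut off, and `selectors_for`, `MetaCollector` and `first_meta` are hypothetical helpers rather than newspaper APIs.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional

# The three selector templates quoted in the issue.
SELECTOR_TEMPLATES = ["meta[property='{}']", "meta[name='{}']", "meta[itemprop='{}']"]


def selectors_for(tag: str) -> List[str]:
    """Fill one tag name into each selector template."""
    return [template.format(tag) for template in SELECTOR_TEMPLATES]


class MetaCollector(HTMLParser):
    """Collect the content of <meta> tags keyed by property, name or itemprop."""

    def __init__(self) -> None:
        super().__init__()
        self.found: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        content = attributes.get("content")
        if content is None:
            return
        for key in ("property", "name", "itemprop"):
            if attributes.get(key):
                # Keep the first value seen for each tag name.
                self.found.setdefault(attributes[key], content)


def first_meta(html: str, candidates: List[str]) -> Optional[str]:
    """Return the content of the first candidate tag present in the HTML."""
    collector = MetaCollector()
    collector.feed(html)
    for tag in candidates:
        if tag in collector.found:
            return collector.found[tag]
    return None


sample_html = """<html><head>
<meta property="og:title" content="Example headline">
<meta name="description" content="Short summary">
<meta itemprop="datePublished" content="2023-01-02">
</head></html>"""

title = first_meta(sample_html, ["og:title"])
published = first_meta(sample_html, ["article:published_time", "datePublished"])
```

Ordering the candidate list from most to least trusted tag lets the first match win, which is one simple way to combine several tags for the same field.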

Fix xpath selector for extracting feeds

For https://github.com/codelucas/newspaper/issues/731. I also added a test case based on a mock already included in the test data fixtures, adapting the test code from the issue I created: ``` import newspaper...

sirpengi

not working for gnews.org

https://gnews.org/articles/1068907

I used `article.text` on this page and got no text. `newspaper.build` does not work for gnews either.

```python
import newspaper

gnews = newspaper.build('https://gnews.org/', language='zh')
article = gnews.articles[0]
article.download()
article.parse()
...
```

Jooey233

the API doesn't work

Hi all. I was using newspaper3k and it was working fine, but today it stopped working and returns empty text. Does anyone have any ideas?

androidAppMe

download() halts/stuck forever with a specific URL

The following neither times out nor returns anything.

```
import newspaper

url = "http://http-live.sr.se/srextra01-mp3-192"
article = newspaper.Article(url, request_timeout=5)
article.download()
```

Same with:

```
from newspaper.network import get_html_2XX_only
article.config.__dict__
{'MIN_WORD_COUNT': 300, 'MIN_SENT_COUNT': 7, 'MAX_TITLE':...
```

KeremTurgutlu
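
One possible reading, which the excerpt does not confirm, is that the URL serves an endless audio stream, so a per-read timeout never fires because bytes keep arriving. A caller-side guard could cap both total size and total time while reading. In the sketch below, `read_capped`, `DownloadLimitExceeded` and `endless_stream` are hypothetical names and the limits are arbitrary. It works on any iterator of byte chunks and does not touch newspaper's internals.

```python
import time
from typing import Iterable, Iterator


class DownloadLimitExceeded(Exception):
    """Raised when a body grows past the size or time budget."""


def read_capped(chunks: Iterable[bytes], max_bytes: int, max_seconds: float) -> bytes:
    """Concatenate chunks, aborting once either budget is exceeded."""
    deadline = time.monotonic() + max_seconds
    body = bytearray()
    for chunk in chunks:
        body.extend(chunk)
        if len(body) > max_bytes:
            raise DownloadLimitExceeded(f"more than {max_bytes} bytes received")
        if time.monotonic() > deadline:
            raise DownloadLimitExceeded(f"still reading after {max_seconds} seconds")
    return bytes(body)


def endless_stream() -> Iterator[bytes]:
    """Stand-in for a live stream that never ends (illustration only)."""
    while True:
        yield b"\x00" * 1024


# A finite body passes through unchanged.
small = read_capped([b"ab", b"cd"], max_bytes=100, max_seconds=5.0)

# An endless body is cut off instead of hanging.
try:
    read_capped(endless_stream(), max_bytes=10 * 1024, max_seconds=5.0)
    stopped = False
except DownloadLimitExceeded:
    stopped = True
```

With the `requests` library, the chunks could come from `response.iter_content(chunk_size=8192)` on a response fetched with `stream=True` and a `timeout`. The time check here only runs when a chunk arrives, so it complements a per-read timeout rather than replacing it.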


