article-extractor topic

List article-extractor repositories

trafilatura

3.0k
Stars
228
Forks

Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments

paperoni

126
Stars
5
Forks

An article extractor in Rust

article-extractor

1.4k
Stars
131
Forks

Extract the main article from a given URL with Node.js

php-goose

456
Stars
120
Forks

Readability / HTML Content / Article Extractor & Web Scraping library written in PHP

SmartReader

149
Stars
34
Forks

SmartReader is a library to extract the main content of a web page, based on a port of the Readability library by Mozilla

markdown_articles_tool

108
Stars
24
Forks

Parse a Markdown article, download its images, and replace the image URLs with local paths

sneakpeek

102
Stars
17
Forks

Reddit bot to preview and post hyperlinks as comments

newshound

29
Stars
3
Forks

A Python package for systematically extracting multiple data elements (e.g., title, keywords, text) from news sources around the world in over 50 languages.