article-extracting topic

List article-extracting repositories

ftr-site-config

351
Stars
254
Forks

Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications.

NewsCatchr

216
Stars
51
Forks

FOSS Android News Reader App

SmartReader

149
Stars
34
Forks

SmartReader is a library that extracts the main content of a web page, based on a port of Mozilla's Readability library.

markdown_articles_tool

108
Stars
24
Forks

Parse a Markdown article, download its images, and replace the image URLs with local paths.

article-parser

43
Stars
6
Forks

Extract an article or news item from a URL or HTML, parse the title and content, and output the result in Markdown format.

newshound

29
Stars
3
Forks

This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around the world in over 50 languages.

dnlp

17
Stars
5
Forks

📚 A collection of useful Natural Language Processing utilities: text language detection, splitting text into sentences, and extracting the main content from an HTML document.