Results: 3 repositories owned by Cherokee

arex

Stars: 30, Forks: 10

Node.js article extractor with automatic summarization.

neocrawler

Stars: 151, Forks: 100

Node.js crawler with scheduler, spider, web UI configuration, and proxy modules. Built with Node.js, Redis/SSDB, HBase, and PhantomJS. Supports both CSS selector and regex extraction rules.

node-article-extractor

Stars: 19, Forks: 2

Automatically extracts body content (and other cool stuff) from an HTML document. Based on https://github.com/ageitgey/node-unfluff, with added support for Chinese.