html-parser topic

Repositories listed under the html-parser topic

NSoup

154 stars, 49 forks

NSoup is a .NET port of jsoup (http://jsoup.org), the HTML parser and sanitizer originally written in Java.

jusText

688 stars, 78 forks

Heuristic-based boilerplate removal tool.

html5parser

173 stars, 25 forks

A super tiny and fast HTML5 AST parser.

floki

2.0k stars, 153 forks

Floki is a simple HTML parser that enables search for nodes using CSS selectors.

save-for-offline

139 stars, 44 forks

Android app for saving webpages for offline reading.

Modest

719 stars, 65 forks

Modest is a fast HTML renderer implemented as a pure C99 library with no outside dependencies.

myhtml

1.6k stars, 146 forks

Fast, multithreaded C/C++ HTML5 parser.

HtmlMonkey

51 stars, 9 forks

Lightweight HTML/XML parser written in C#.

AdvancedHTMLParser

97 stars, 26 forks

Fast, indexed Python HTML parser that builds a DOM node tree and provides the common getElementsBy* functions for scraping, testing, modification, and formatting. XPath is also supported.

jodd

4.1k stars, 724 forks

Jodd! Lightweight. Java. Zero dependencies. Use what you like.