html-parsing topic

List html-parsing repositories

interweave

Stars: 1.1k, Forks: 38

🌀 React library to safely render HTML, filter attributes, autowrap text with matchers, render emoji characters, and much more.

Fuzi

Stars: 1.1k, Forks: 149

A fast & lightweight XML & HTML parser in Swift with XPath & CSS support

htmldate

Stars: 113, Forks: 27

Fast and robust date extraction from web pages, with Python or on the command line

jusText

Stars: 688, Forks: 78

Heuristic-based boilerplate removal tool

parse5

Stars: 3.6k, Forks: 231

HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant.

goquery

Stars: 13.6k, Forks: 910

A little like that j-thing, only in Go.

breadability

Stars: 203, Forks: 26

A reworked version of the https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative)

scala-scraper

Stars: 711, Forks: 106

A Scala library for scraping content from HTML pages

XML-Parser

Stars: 17, Forks: 8

A Node.js XML DOM, Parser & Stringifier.

HTMLp

Stars: 31, Forks: 14

Delphi DOM HTML parser and converter. Fork (not from the original author) of https://sourceforge.net/projects/htmlp/