Readability4J

A Kotlin port of Mozilla's Readability. It extracts a website's relevant content and removes all clutter from it.

Results 15 Readability4J issues

Hello, Mozilla's Readability filters out `` tags before processing the HTML further, as can be seen in https://github.com/mozilla/readability/blob/master/Readability.js#L633. Readability4J, however, does not do this: https://github.com/dankito/Readability4J/blob/master/src/main/kotlin/net/dankito/readability4j/processor/ArticleGrabber.kt#L753 I understood that this library...

When I use getContentWithUtf8Encoding to get the HTML value, I get incorrect data: ``` ``` It should be: ``` ``` Version: ``` net.dankito.readability4j readability4j 1.0.4 ```

The port gets removed from the URI when running the method that resolves relative URIs to absolute ones. This results in a broken link for all URIs that do not use default ports (80...
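
To illustrate the behavior this report expects, here is a minimal sketch that resolves links against a base URI with the JDK's `java.net.URI`, which keeps a non-default port. The helper name `toAbsoluteUri` and the example URLs are hypothetical and are not taken from Readability4J; this is not the library's actual resolution code.

```kotlin
import java.net.URI

// Hypothetical helper, not part of Readability4J: resolves a possibly
// relative link against the page's base URI using java.net.URI.
// Note: URI(...) throws URISyntaxException for malformed input (e.g. spaces).
fun toAbsoluteUri(baseUri: String, link: String): String =
    URI(baseUri).resolve(link).toString()

fun main() {
    // Assumed example base URI with a non-default port (8080).
    val base = "http://localhost:8080/articles/index.html"

    // Path-relative link: resolved against the base's directory.
    println(toAbsoluteUri(base, "images/photo.png"))

    // Root-relative link: resolved against the base's scheme, host and port.
    println(toAbsoluteUri(base, "/about"))
}
```

In both cases the resolved URI should still contain `:8080`, because `URI.resolve` carries over the base's whole authority component (host and port) rather than rebuilding the URL from scheme and host alone.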

Some of the dependencies need major version bumps to avoid vulnerabilities. Looking at a few on the Maven repository: - Jsoup 1.11.2: 2 direct vulnerabilities and multiple indirect...

Hello, First, I would like to express my appreciation to @dankito and everyone else involved for developing such a useful library as Readability4J. I encountered an issue while parsing content...