Readability4J

A Kotlin port of Mozilla's Readability. It extracts a website's relevant content and removes all clutter from it.

Results 15 Readability4J issues

https://www.natureworldnews.com/articles/45834/20210427/sumatran-rhinoceros-striving-genetic-diversity-despite-extinction.htm There is a rhino picture at the beginning of the article, but after I parse the html using Readability4JExtended or Readability version 1.0.6, the img tag is gone and...

Characters like äüö are output incorrectly on some websites. These characters are common in German. They do not occur in English, so the problem does not show up there....
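One common cause of garbled umlauts is decoding UTF-8 bytes with a single-byte charset such as ISO-8859-1. The report above does not state the actual cause, so the following is only a minimal sketch of that assumption using plain JDK classes. `CharsetDemo` and `decode` are hypothetical names and are not part of Readability4J.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Hypothetical helper: turns raw response bytes into a String using an explicit charset.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // "äüö" written with Unicode escapes so the source file encoding does not matter.
        String umlauts = "\u00e4\u00fc\u00f6";
        byte[] utf8Body = umlauts.getBytes(StandardCharsets.UTF_8);

        // Assumed failure mode: UTF-8 bytes decoded as ISO-8859-1 yield two characters per umlaut.
        String wrong = decode(utf8Body, StandardCharsets.ISO_8859_1);
        // Decoding with the charset the page was actually served in restores the text.
        String right = decode(utf8Body, StandardCharsets.UTF_8);

        System.out.println(wrong.length());        // 6
        System.out.println(right.equals(umlauts)); // true
    }
}
```

If this is the cause, the fix belongs where the HTML is downloaded: decode the response with the charset declared by the page or its HTTP headers before passing the string on for extraction.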

I get blank content for these URLs, among others:
- https://vtm.zive.cz/clanky/robot-atlas-predvadi-dokonaly-parkour-tentokrat-boston-dynamics-pridava-i-nepovedene-zabery/sc-870-a-211811/default.aspx
- https://vtm.zive.cz/clanky/nove-solarni-letadlo-by-melo-byt-schopne-zustat-ve-vzduchu-90-dni-v-kuse/sc-870-a-211576/default.aspx
- https://mobilmania.zive.cz/clanky/google-pixel-5a-5g-je-prijemnou-evoluci-dostal-vodeodolnost-a-silnou-baterii/sc-3-a-1352462/default.aspx

The appearances of the blockquote and ul, ol, li elements displayed in the images look extremely strange. I would appreciate it if you could make a better design. ![image](https://user-images.githubusercontent.com/61169988/125096603-fa7fb600-e0dd-11eb-9f35-c55a291a5a39.png) ![image](https://user-images.githubusercontent.com/61169988/125096623-ff446a00-e0dd-11eb-8f99-88cacd50b6c7.png)

The following page: https://netflixtechblog.com/full-cycle-developers-at-netflix-a08c31f83249 has `img` tags with an empty `src` attribute. I think the `src` is set via JavaScript on scroll, or via `noscript` tags right after the `img`...
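A possible pre-processing step for the lazy-loading case above is to copy the `src` from the `noscript` fallback into the empty `img` before running extraction. The sketch below assumes the markup is literally an `img` with `src=""` immediately followed by `<noscript><img src="..."></noscript>`; the real page may differ. `NoscriptImages` and `promoteNoscriptSrc` are hypothetical names, not Readability4J API, and a regular expression is used only to keep the example dependency-free (an HTML parser would be more robust).

```java
import java.util.regex.Pattern;

public class NoscriptImages {
    // Matches an <img> with an empty src, directly followed by a <noscript> holding a fallback <img>.
    // Group 1: attributes before src, group 2: attributes after src, group 3: the fallback src value.
    private static final Pattern LAZY_IMG = Pattern.compile(
        "<img\\b([^>]*?\\s)src=\"\"([^>]*)>\\s*<noscript>\\s*"
            + "<img\\b[^>]*?\\ssrc=\"([^\"]+)\"[^>]*>\\s*</noscript>",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: replaces each matched pair with a single <img> carrying the fallback src.
    static String promoteNoscriptSrc(String html) {
        return LAZY_IMG.matcher(html).replaceAll("<img$1src=\"$3\"$2>");
    }

    public static void main(String[] args) {
        String html = "<img class=\"lazy\" src=\"\">"
            + "<noscript><img src=\"https://example.com/a.png\"></noscript>";
        // Prints: <img class="lazy" src="https://example.com/a.png">
        System.out.println(promoteNoscriptSrc(html));
    }
}
```

Images that already have a non-empty `src` are left untouched, so the step can be applied to any page before parsing.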

Added the isProbablyReaderable function from readability.js

I am testing using the HTML from this page: https://www.beatportal.com/features/beatports-definitive-guide-to-techno/ It only seems to return the output from the middle of the page: From: `Mark Ernestus, founder of record shop...

Hi. I can't quite see whether you've got a Java version of isProbablyReaderable, or how to use it. My code is:
```
Readability4J readability4J = new Readability4J(url, html);
Article article...
```

Hello @dankito, thanks for the project. It may sound like a silly question, as the project is already written in Kotlin. However, I couldn't make it work on Kotlin Gradle...

Example code. [Screenshot.](https://imgur.com/fNlKXJV)
```
class MainActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        val webView = findViewById<WebView>(R.id.webView)
        webView?.let {
            it.settings.javaScriptEnabled = true
            it.webViewClient = MyWebViewClient()
        }
        webView.loadUrl("https://en.wikipedia.org/wiki/The_New_York_Times")
        ...
```