NReadability

Hidden section returned instead of the main article body

Open razvangoga opened this issue 9 years ago • 0 comments

Hi,

I've been using your component for some time with good results, but lately we have encountered more and more cases like this one (http://ir.tcfbank.com/file/Index?KeyFile=32068838), where NR extracts the terms and conditions text (hidden, and shown in a popup only when you click the "terms and conditions" link at the bottom of the article) instead of the actual article body. Technically the decision is correct, as the "t&c" text is larger and more compact than the main article body.

In other cases the (now) omnipresent "this website uses cookies" text is chosen instead of the article on the same grounds.

Do you have any plans to address such issues in the near future?

For the moment we have resolved it by using an in-house modified version of NR where we can tweak the algorithm's regexes on a case-by-case basis to exclude the irrelevant content.
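To illustrate the general idea of excluding irrelevant blocks by regex, here is a minimal sketch written in Python with only the standard library. It is not NReadability code and not the in-house modification itself: it is a pre-processing step that drops hidden or cookie/terms-like elements before the HTML is handed to any extractor. The names `strip_unlikely_blocks`, `EXCLUDE_ID_CLASS` and `HIDDEN_STYLE`, as well as the patterns themselves, are hypothetical and would need tuning per site. The sketch assumes reasonably well-nested markup (an unclosed tag inside an excluded block will confuse the depth counter), only sees inline styles and the `hidden` attribute (not rules from external stylesheets), and drops comments and the doctype.

```python
import re
from html.parser import HTMLParser

# Hypothetical exclusion patterns; adjust them case by case.
EXCLUDE_ID_CLASS = re.compile(r"cookie|consent|terms|disclaimer|popup|modal", re.I)
HIDDEN_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)
# Void elements never get a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class _Pruner(HTMLParser):
    """Re-emits the HTML while skipping excluded subtrees."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside an excluded subtree

    def _excluded(self, attrs):
        a = {k: (v or "") for k, v in attrs}
        if "hidden" in a or HIDDEN_STYLE.search(a.get("style", "")):
            return True
        return bool(EXCLUDE_ID_CLASS.search(a.get("id", "") + " " + a.get("class", "")))

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1  # nested element inside a skipped block
            return
        if self._excluded(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1  # start skipping this subtree
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a subtree.
        if not self.skip_depth and not self._excluded(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_unlikely_blocks(html: str) -> str:
    """Return the HTML with hidden and cookie/terms-like elements removed."""
    pruner = _Pruner()
    pruner.feed(html)
    pruner.close()
    return "".join(pruner.out)
```

The cleaned string would then be passed to the extractor in place of the raw page, so the larger but irrelevant block is no longer a candidate.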

Thanks and best regards, Razvan Goga

razvangoga, Dec 09 '15 10:12