newspaper4k

Too many authors on Techcrunch...

Open: AndyTheFactory opened this issue 2 years ago • 1 comment

Issue by stefano-bragaglia Wed Sep 6 10:35:26 2017 Originally opened as https://github.com/codelucas/newspaper/issues/434


I'm just reporting this issue I've experienced when reading the following page with this library: https://techcrunch.com/2017/09/05/moleskines-next-paper-planner-will-automatically-sync-with-google-calendar-and-apples-ical/

Instead of finding Matt Burns as the only author of this page, the library returns several other names as well (Sarah Buhr, Devin Coldewey, Darrell Etherington, Samantha Stein, Tito, Khaled, Steve O'Hear, Jon Russell, Mjburnsy and Matt Burns). After inspecting the HTML of the page, I noticed that the extra names come from .trending_by_line elements, which are not visible on the page but are still present in the markup. I suspect this part changes over time, so anyone who revisits the same page later will probably get a different set of authors. In any case, this part shouldn't be picked up as part of the main article.

I don't know how to fix this by myself, but happy to help!
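
One possible direction, sketched here with the standard library only and not taken from newspaper's actual author extraction code: collect byline text while ignoring everything nested inside a .trending_by_line element. The class name `byline`, the `BylineCollector` helper, and the sample markup are assumptions made up for illustration; only `.trending_by_line` comes from the page described above.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BylineCollector(HTMLParser):
    """Hypothetical helper: collect byline text, skipping excluded subtrees."""

    def __init__(self, byline_class="byline", excluded_class="trending_by_line"):
        super().__init__()
        self.byline_class = byline_class      # assumed class of real bylines
        self.excluded_class = excluded_class  # class reported in this issue
        self.skip_depth = 0    # > 0 while inside an excluded subtree
        self.byline_depth = 0  # > 0 while inside a byline element
        self.authors = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.skip_depth or self.excluded_class in classes:
            self.skip_depth += 1
        elif self.byline_depth or self.byline_class in classes:
            self.byline_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1
        elif self.byline_depth:
            self.byline_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.byline_depth and not self.skip_depth:
            self.authors.append(text)


# Made-up markup that mimics the situation: one real byline plus hidden
# "trending" bylines that should not count as authors of this article.
SAMPLE_HTML = """
<article>
  <div class="byline"><a href="#">Matt Burns</a></div>
  <p>Article text.</p>
  <ul class="trending">
    <li class="trending_by_line"><span class="byline">Sarah Buhr</span></li>
    <li class="trending_by_line"><span class="byline">Devin Coldewey</span></li>
  </ul>
</article>
"""

collector = BylineCollector()
collector.feed(SAMPLE_HTML)
print(collector.authors)
```

The depth counters assume reasonably well-formed markup. A real fix would more likely remove the unwanted nodes from the parsed document before author extraction runs, but that depends on the library's internals, which are not shown in this thread.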

AndyTheFactory, Oct 24 '23 10:10

Comment by startupflux Sat Nov 18 10:28:50 2017


I see the same issue and am not sure how to fix it. newspaper is picking up authors from the 'Read more' section on the page.

AndyTheFactory, Oct 24 '23 10:10