Andrei Paraschiv
**Comment by [codelucas](https://github.com/codelucas)** _Sat Apr 7 17:42:49 2018_ ---- Thanks for filing, @kalhan123. Can you please post a few different links, each with the expected full-text extract and the actual full-text extract?
**Comment by [kalhan123](https://github.com/kalhan123)** _Mon Apr 9 10:20:56 2018_ ---- **sample text req-**Walmart completed a thorough due diligence process on e-commerce firm Flipkart this week, two sources said, as the US...
**Comment by [Neileruaa](https://github.com/Neileruaa)** _Sun Mar 21 13:31:00 2021_ ---- Hey! Did you find a solution to your problem? I am also looking for a solution...
**Comment by [panditarevolution](https://github.com/panditarevolution)** _Fri Nov 9 20:01:38 2018_ ---- I found the same issue with articles on Quanta Magazine and several other news sites. This was also via the [demo app](http://newspaper-demo.herokuapp.com/articles/show?url_to_clean=https%3A%2F%2Fwww.quantamagazine.org%2Fneutral-theory-of-evolution-challenged-by-evidence-for-dna-selection-20181108%2F).
**Comment by [heisenburger](https://github.com/heisenburger)** _Wed Mar 27 02:57:31 2019_ ---- Same here with The New Yorker and some other sites.
**Comment by [manoadamro](https://github.com/manoadamro)** _Wed Apr 3 17:04:57 2019_ ---- For many UK news outlets, you will only get the first paragraph.
**Comment by [dividor](https://github.com/dividor)** _Sun Apr 28 18:01:38 2019_ ---- It's a pretty fundamental problem affecting major news outlets. Is there any update on whether this might be resolved?
**Comment by [dividor](https://github.com/dividor)** _Sat May 4 14:40:50 2019_ ---- Not sure if this is something that might be useful or not, but in order to try and programmatically identify partial...
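For anyone who wants to flag partial extractions programmatically, here is a minimal sketch of one possible heuristic. It is not necessarily the approach described in the comment above, which is cut off. It compares the word count of the extracted text with the word count of all `<p>` text in the raw HTML. The names `ParagraphTextCollector`, `extraction_coverage`, and `looks_partial`, and the `0.5` threshold, are hypothetical and would need tuning per site.

```python
from html.parser import HTMLParser


class ParagraphTextCollector(HTMLParser):
    """Collects the text found inside <p> elements of an HTML document."""

    def __init__(self):
        super().__init__()
        self._depth = 0   # number of <p> elements currently open
        self.chunks = []  # text fragments seen inside <p> elements

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth > 0:
            self.chunks.append(data)


def extraction_coverage(html, extracted_text):
    """Ratio of extracted words to words in all <p> text, capped at 1.0."""
    collector = ParagraphTextCollector()
    collector.feed(html)
    page_words = " ".join(collector.chunks).split()
    if not page_words:
        return 1.0  # nothing to compare against, so do not flag the page
    return min(1.0, len(extracted_text.split()) / len(page_words))


def looks_partial(html, extracted_text, threshold=0.5):
    """Hypothetical rule: flag when under `threshold` of <p> text was kept."""
    return extraction_coverage(html, extracted_text) < threshold
```

In practice `html` and `extracted_text` would come from whatever extractor you are using. Pages with many non-article paragraphs (comments, footers) will push the ratio down, so treat a flag as a hint to inspect the page rather than proof of a bad extraction.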
**Comment by [mmaybeno](https://github.com/mmaybeno)** _Thu Jan 2 20:46:00 2020_ ---- If you inspect the HTML of the NYT and other sites where the parsing is not working as expected, they are...
**Comment by [mmaybeno](https://github.com/mmaybeno)** _Thu Jan 2 21:35:10 2020_ ---- Upon further inspection, it has something to do with the `calculate_best_node` function in the extractor. For a given article, there is...
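To illustrate how a "pick the single best node" strategy could return only part of an article, here is a toy model. It is not the library's actual `calculate_best_node` code and not a reconstruction of the analysis above, which is cut off. It assumes that each container is scored only by the words in its direct `<p>` children and that only the top-scoring container is kept. The helpers `word_count`, `best_single_node`, and `text_of`, and the sample markup, are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: the article body is split across sibling containers.
PAGE = """
<body>
  <div id="a"><p>The opening paragraph is long enough to win the score.</p></div>
  <div id="b"><p>Second part.</p></div>
  <div id="c"><p>Third part.</p></div>
</body>
"""


def word_count(node):
    """Toy score: number of words in the node's direct <p> children."""
    return sum(len((p.text or "").split()) for p in node.findall("p"))


def best_single_node(root):
    """Return the one container with the highest toy score."""
    candidates = [n for n in root.iter() if n.findall("p")]
    return max(candidates, key=word_count)


def text_of(node):
    """Join the text of the node's direct <p> children."""
    return " ".join((p.text or "").strip() for p in node.findall("p"))


root = ET.fromstring(PAGE)
best = best_single_node(root)
partial_text = text_of(best)  # only the paragraph inside the winning <div>
```

Under these assumptions the first `<div>` wins and the paragraphs in its siblings are dropped, which matches the "only the first paragraph" symptom reported in this thread. Whether that is what actually happens inside the extractor would need to be confirmed against its source.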