
incomplete content on multiple pages

Open Grienauer opened this issue 1 year ago • 2 comments

Currently, on the following pages, the parser seems to get lost. I don't see any markup problems. Maybe the newspapers detect and block the scraper?

https://www.derstandard.at/story/2000145508819/franzoesischer-verfassungsrat-stimmt-umstrittener-pensionsreform-zu Here, a notice gets added to the text saying that some "software" is blocking stuff and should be removed.

https://kurier.at/wirtschaft/atomausstieg-wie-die-abschaltung-eines-kernkraftwerks-funktioniert/402412829 Only one line of text comes through.

Thanks for any info. Happy to help.

Grienauer, Apr 17 '23 21:04

There are multiple mentions in the issues section of header content being removed erroneously. I think this is the same problem.

I came here to report the same thing happening on Hackaday.com/blog.

Overwatching, Apr 20 '23 21:04

Same on https://www.thetimes.co.uk/ across multiple articles: it clips the first one or two paragraphs on every page I've tried. Kind of useless in this state.

ctipper, Jul 30 '23 20:07