Feedparser extracts incorrect links

Open dimuska139 opened this issue 7 years ago • 1 comments

When I try to parse this RSS feed, entry.link doesn't match what I expect. I think Feedparser handles the <yandex:related> block incorrectly.

Here are my code and the RSS content as of the time of writing this issue: code_and_rss.tar.gz

I expect to see the following (these are valid links to the latest posts):

http://paleonews.ru/new/1165-300mlnlet
http://paleonews.ru/new/1164-mirarce
http://paleonews.ru/new/1162-chelonoidis-evol
http://paleonews.ru/new/1160-biggest-living-filters
http://paleonews.ru/new/1159-razlom
http://paleonews.ru/new/1158-eggs
http://paleonews.ru/new/1157-blind-vorombe
http://paleonews.ru/new/1156-stellerova
http://paleonews.ru/new/1155-bug-in-birmit
http://paleonews.ru/new/1154-piranhamesodon

Instead I see:

https://naked-science.ru/article/sci/v-grand-kanone-nashli-sledy
https://nplus1.ru/news/2018/11/13/mirarce-eatoni
http://paleonews.ru/new/1162-chelonoidis-evol
http://paleonews.ru/index.php
https://naked-science.ru/article/sci/paleontologi-obnaruzhili-shest-novyh
https://naked-science.ru/article/sci/poyavlenie-okraski-u-ptichih-yaic
https://nplus1.ru/news/2018/10/31/blind-Aepyornises
https://22century.ru/biology-and-biotechnology/71305
https://42.tut.by/613815
https://www.sciencemag.org/news/2018/10/piranhalike-teeth-and-torn-fins-reveal-ancient-fish-fight

I am using the latest version of Feedparser (5.2.1).
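
For anyone hitting the same behaviour, here is a minimal workaround sketch that does not use Feedparser at all. It reads each item's own <link> with the standard library and ignores any <link> nested deeper in the item. The sample feed, the example.com URLs, the helper name item_links, and the assumption that <yandex:related> wraps nested <link> elements are all mine for illustration; the real feed in the archive may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed (NOT the real paleonews.ru content).
# Assumption: <yandex:related> contains nested <link> elements that
# point to external articles.
SAMPLE_RSS = """<rss version="2.0" xmlns:yandex="http://news.yandex.ru">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <item>
      <title>First post</title>
      <link>http://example.com/new/2-second</link>
      <yandex:related>
        <link url="https://other.example.org/a">Related A</link>
      </yandex:related>
    </item>
    <item>
      <title>Second post</title>
      <yandex:related>
        <link url="https://other.example.org/b">Related B</link>
      </yandex:related>
      <link>http://example.com/new/1-first</link>
    </item>
  </channel>
</rss>
"""


def item_links(rss_text):
    """Return the text of each <item>'s direct-child <link>, in feed order."""
    root = ET.fromstring(rss_text)
    links = []
    for item in root.iter("item"):
        # find() with a bare tag name only looks at direct children,
        # so <link> elements inside <yandex:related> are skipped.
        link = item.find("link")
        if link is not None and link.text:
            links.append(link.text.strip())
    return links


if __name__ == "__main__":
    for url in item_links(SAMPLE_RSS):
        print(url)
```

This only illustrates the expected result (one link per item, taken from the item itself); it is not a fix for Feedparser.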

Nov 19 '18 18:11 dimuska139

Paleonews.ru doesn't exist anymore, but you can find an RSS sample in the attached archive.

This issue is no longer important to me, as I have not developed in Python for a long time. Feel free to close it if you do not consider it relevant.

Feb 17 '25 12:02 dimuska139