feedparser
Feedparser extracts incorrect links
When I parse this RSS feed, entry.link doesn't match what I expect. I think Feedparser handles the <yandex:related> block incorrectly.
Here are my code and the RSS content as they were when I wrote this issue: code_and_rss.tar.gz
I expect to see the following (these are valid links to the latest posts):
- http://paleonews.ru/new/1165-300mlnlet
- http://paleonews.ru/new/1164-mirarce
- http://paleonews.ru/new/1162-chelonoidis-evol
- http://paleonews.ru/new/1160-biggest-living-filters
- http://paleonews.ru/new/1159-razlom
- http://paleonews.ru/new/1158-eggs
- http://paleonews.ru/new/1157-blind-vorombe
- http://paleonews.ru/new/1156-stellerova
- http://paleonews.ru/new/1155-bug-in-birmit
- http://paleonews.ru/new/1154-piranhamesodon
Instead I see:
- https://naked-science.ru/article/sci/v-grand-kanone-nashli-sledy
- https://nplus1.ru/news/2018/11/13/mirarce-eatoni
- http://paleonews.ru/new/1162-chelonoidis-evol
- http://paleonews.ru/index.php
- https://naked-science.ru/article/sci/paleontologi-obnaruzhili-shest-novyh
- https://naked-science.ru/article/sci/poyavlenie-okraski-u-ptichih-yaic
- https://nplus1.ru/news/2018/10/31/blind-Aepyornises
- https://22century.ru/biology-and-biotechnology/71305
- https://42.tut.by/613815
- https://www.sciencemag.org/news/2018/10/piranhalike-teeth-and-torn-fins-reveal-ancient-fish-fight
I use the latest version of Feedparser (5.2.1).
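As a cross-check, here is a minimal sketch that reads only the <link> elements that are direct children of each <item>, using the standard library instead of Feedparser. It is not a fix for Feedparser, and it is not the code from the archive. The sample feed, the example.com URLs, the namespace URI, the shape of the <yandex:related> block, and the helper name item_links are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: the URLs and the layout of <yandex:related> are
# assumptions for illustration, not the real paleonews.ru content.
SAMPLE_RSS = """<rss version="2.0" xmlns:yandex="http://news.yandex.ru">
  <channel>
    <title>Example</title>
    <link>http://example.com/</link>
    <item>
      <title>Post one</title>
      <link>http://example.com/new/1-post-one</link>
      <yandex:related>
        <link url="https://other.example.org/source-article">Source</link>
      </yandex:related>
    </item>
    <item>
      <title>Post two</title>
      <link>http://example.com/new/2-post-two</link>
    </item>
  </channel>
</rss>
"""


def item_links(rss_text):
    """Return the text of each item's own <link>, skipping nested links."""
    root = ET.fromstring(rss_text)
    links = []
    for item in root.iter("item"):
        # findtext("link") matches direct children of <item> only, so a
        # <link> nested inside <yandex:related> is never picked up.
        link = item.findtext("link")
        links.append(link.strip() if link else None)
    return links


print(item_links(SAMPLE_RSS))
```

If the list printed here matches the expected links while entry.link from Feedparser does not, that would be consistent with the nested links overriding the item's own link.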
Paleonews.ru no longer exists, but you can find an RSS sample in the attached archive.
This issue is no longer important to me, as I have not developed in Python for a long time. Feel free to close it if you do not consider it relevant.