[Question]: Does Fundus intentionally avoid crawling inner page links for article discovery?

Open MSDuran opened this issue 8 months ago • 1 comment

Question

As far as I can see, Fundus currently relies on sources such as RSS feeds, newsmaps, and sitemaps to harvest news articles, and does not automatically follow links inside a publisher's pages to discover additional content. Is this an intentional design choice, or a feature that still needs to be added?

MSDuran avatar May 15 '25 15:05 MSDuran

It was an intentional design choice in the early development of Fundus, with the goal of crawling publishers "politely". That said, we have thought about implementing a spider that can optionally be used as a fallback to follow links on the website. In fact, we have collected some thoughts on that topic in #325. We just haven't gotten around to implementing it yet.

addie9800 avatar May 15 '25 15:05 addie9800
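
For readers wondering what such an optional link-following fallback might look like, here is a minimal sketch using only the Python standard library. It is not Fundus code and does not reflect the design discussed in #325; the names `discover_inner_links` and `_LinkCollector` are hypothetical. It only extracts same-host links from HTML that has already been downloaded, and leaves fetching to the caller.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Collects the raw href values of all <a> tags (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def discover_inner_links(html: str, page_url: str) -> list[str]:
    """Return unique same-host http(s) links found in `html`, in document order.

    Assumption: `html` was fetched from `page_url`; no network access happens here.
    """
    collector = _LinkCollector()
    collector.feed(html)

    host = urlparse(page_url).netloc
    seen: set[str] = set()
    links: list[str] = []
    for href in collector.hrefs:
        # Resolve relative links and drop "#fragment" parts so duplicates collapse.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parsed = urlparse(absolute)
        # Skip mailto:, javascript:, and links that leave the publisher's host.
        if parsed.scheme not in ("http", "https") or parsed.netloc != host:
            continue
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links
```

To stay in the "polite" spirit described above, a real fallback would presumably also consult the publisher's `robots.txt` (for example via `urllib.robotparser`) and rate-limit its requests before fetching any of the discovered links.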