ultimate-sitemap-parser

Reduce recursion level for sitemap fetcher

Open • pypt opened this issue 5 years ago • 1 comment

10 levels deep is probably too much:

2018-11-26 13:11:19,139 INFO mediawords.util.sitemap.helpers [162086/MainThread]: Fetching URL https://www.juiceplus.com/fr/fr/franchise/sitemap.xml...
2018-11-26 13:11:19,428 INFO mediawords.util.sitemap.fetchers [162086/MainThread]: Parsing sitemap from URL https://www.juiceplus.com/fr/fr/franchise/sitemap.xml...
2018-11-26 13:11:19,508 INFO mediawords.util.sitemap.fetchers [162086/MainThread]: Fetching level 8 sitemap from https://www.juiceplus.com/il/en/franchise/sitemap.xml...
2018-11-26 13:11:19,508 INFO mediawords.util.sitemap.helpers [162086/MainThread]: Fetching URL https://www.juiceplus.com/il/en/franchise/sitemap.xml...
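For illustration, here is a rough sketch of what a depth-limited fetcher could look like. This is not the library's actual code: `collect_page_urls`, `fetch`, and `max_depth` are hypothetical names, and `fetch` is assumed to be any callable that returns the sitemap XML for a URL. The sketch also skips sitemaps it has already visited, on the assumption that deep nesting like the above may come from index files that reference each other.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List, Optional, Set

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def collect_page_urls(
    url: str,
    fetch: Callable[[str], str],  # hypothetical: returns sitemap XML for a URL
    max_depth: int = 3,           # hypothetical limit; the root sitemap is depth 0
    _depth: int = 0,
    _seen: Optional[Set[str]] = None,
) -> List[str]:
    """Return page URLs reachable from `url`, at most `max_depth` levels down."""
    if _seen is None:
        _seen = set()
    # Stop when the nesting is too deep or the sitemap was already visited.
    if _depth > max_depth or url in _seen:
        return []
    _seen.add(url)

    root = ET.fromstring(fetch(url))
    pages: List[str] = []
    if root.tag.endswith("sitemapindex"):
        # Index file: recurse into each child sitemap, one level deeper.
        for loc in root.findall("sm:sitemap/sm:loc", NS):
            child = (loc.text or "").strip()
            pages.extend(
                collect_page_urls(child, fetch, max_depth, _depth + 1, _seen)
            )
    else:
        # Regular sitemap (<urlset>): collect the page URLs themselves.
        for loc in root.findall("sm:url/sm:loc", NS):
            pages.append((loc.text or "").strip())
    return pages
```

With a check like `_seen`, a lower `max_depth` only drops sitemaps that are genuinely nested that deep, so the limit and the cycle check address separate problems.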

pypt · Nov 27 '18 10:11

No. The purpose of a sitemap is to list every single page on the website, so lowering the depth would result in an invalid sitemap extraction. I completely disagree that this is a bug.

nubonics · Sep 14 '20 06:09