Ability to scan only the URLs in sitemap.xml
I would like the ability to scan only the URLs listed in the sitemap, to verify that the links generated there work. For example, if I pass the URL https://example.com/sitemap.xml, the crawler would check every URL listed in that sitemap. Currently it only checks that the sitemap itself loads.
I have just tested this using the source code. Running ./crawler --url=https://my-site.com/sitemap.xml did crawl all URLs found in the sitemap. Which version are you using?
I was using the latest macOS arm64 version of the GUI,
and as an example I used https://crawler.siteone.io/sitemap.xml
The report shows that it only visited the sitemap.xml page and none of the URLs within the sitemap.
I used the default settings:
The GUI application may bundle a different version of the core application (this one). Judging by the link you sent, the GUI uses 1.0.8, while the latest release is 1.0.9 (and the source code may be a few steps ahead of that too).
Looking at the release notes, I can see "Crawl from Sitemap: You can now provide a URL to a sitemap.xml or sitemap index file directly to the --url parameter to crawl all listed URLs." I'm not sure whether this covers the whole sitemap functionality, but it looks like the likely explanation.
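To double-check which URLs a sitemap should yield, independently of the crawler version, something like the Python sketch below may help. It is not the crawler's actual implementation, only an illustration of the sitemaps.org format: a regular sitemap has a `urlset` root with `url/loc` entries, and a sitemap index has a `sitemapindex` root with `sitemap/loc` entries pointing at further sitemaps. The names `parse_sitemap`, `EXAMPLE_SITEMAP` and `EXAMPLE_INDEX` are hypothetical, and the sample XML is made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_sitemap(xml_text):
    """Return (kind, locations) for a sitemap document.

    kind is "urlset" for a regular sitemap or "sitemapindex" for an index
    that points at further sitemaps; locations holds the <loc> values.
    """
    root = ET.fromstring(xml_text)
    kind = root.tag.replace(SITEMAP_NS, "")
    if kind not in ("urlset", "sitemapindex"):
        raise ValueError(f"not a sitemap document: <{kind}>")
    # Entries are <url> elements in a sitemap and <sitemap> elements in an index.
    child = "url" if kind == "urlset" else "sitemap"
    locations = [
        loc.text.strip()
        for loc in root.iterfind(f"{SITEMAP_NS}{child}/{SITEMAP_NS}loc")
        if loc.text
    ]
    return kind, locations


# Made-up sample documents, used only to illustrate the two shapes.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

EXAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-pages.xml</loc></sitemap>
</sitemapindex>"""

if __name__ == "__main__":
    print(parse_sitemap(EXAMPLE_SITEMAP))
    print(parse_sitemap(EXAMPLE_INDEX))
```

If the list this produces for your sitemap is longer than the list of visited URLs in the report, the crawler build in use is probably not expanding the sitemap.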