Allow restriction of crawls to a single domain
This has been requested a few times, most recently by Beaudry Allen, Digital Archivist at Villanova, but there is currently no way to do it in the WAIL interface.
Q: What needs to be included in a Heritrix crawl job to restrict a crawl to a single domain?
Related: #350
Adding the following to a crawl configuration should accomplish this:
<bean class="org.archive.modules.deciderules.surt.OnDomainsDecideRule">
  <property name="decision" value="ACCEPT"/>
</bean>
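For context, here is a rough sketch of where that bean might sit, assuming the job is based on a stock Heritrix 3 `crawler-beans.cxml` in which the scope is a `DecideRuleSequence` bean with id `scope`. The placement and the surrounding rules are assumptions and have not been checked against the template WAIL ships; any other rules already in the job's scope would stay where they are.

```xml
<!-- Sketch only: assumes a Heritrix 3 style scope definition. -->
<bean id="scope" class="org.archive.modules.deciderules.DecideRuleSequence">
  <property name="rules">
    <list>
      <!-- Start by rejecting everything, then accept selectively. -->
      <bean class="org.archive.modules.deciderules.RejectDecideRule"/>
      <!-- Accept URIs on the same domains as the seeds. -->
      <bean class="org.archive.modules.deciderules.surt.OnDomainsDecideRule">
        <property name="decision" value="ACCEPT"/>
      </bean>
      <!-- Any remaining rules from the existing configuration go here. -->
    </list>
  </property>
</bean>
```

Because the sequence begins with a reject-all rule, only URIs accepted by a later rule stay in scope. If the existing configuration already accepts by seed SURT prefix, that rule would likely need to be replaced rather than supplemented for the restriction to take effect.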