Alex Osborne
I can confirm such an option is not currently implemented. I believe the original developers intentionally left it out as Heritrix enables trivial remote code execution and so they wanted...
As this is about rewriting, this is likely an issue with the (closed-source) Wayback replay software, not with the Heritrix web crawler.
OK by me, although this does include the removal of the hbase module, which someone objected to last year (see #313).
> e.g. logToFile property not documented Looks like the logToFile property exists (1) on DecideRuleSequence and (2) on everything inheriting from Scoper. 1. I've added DecideRuleSequence to the bean reference...
Just noting that if anyone would like to see a Dockerfile merged, please submit it as a pull request and include the documentation/examples you feel appropriate. I'm willing to merge it...
All I had in mind was a pull request that adds the Dockerfile itself and maybe a section named something like 'Running Heritrix under Docker' with some brief usage...
Thanks. That looks great. I've merged it and pushed the main and contrib images to iipc/heritrix. I had intended to automate this with the autobuilder but it seems the free...
From the extracted links, it seems to be a redirect, not a 204.
Hi Mikel, The install guide you linked assumes the user is installing the precompiled application, not building it from source. The step you seem to be missing...
We probably need to make all the extractors use the outlink helper methods in the Extractor base classes consistently, as there are a number of them that call `curi.getOutlinks().add(link)` directly. Then...