web-monitoring-processing

Allow limiting imports to one version per day

Open Mr0grog opened this issue 7 months ago • 0 comments

Some pages get captured a lot by the Internet Archive, and it’s not really necessary or valuable for us to import and track every one of those captures. Now you can set --skip-unchanged day to import at most one version per day (more or less; in some cases we might wind up importing more).

I’ve been using this when loading historical data for new URLs we track, but am not using it for our regular nightly imports of all URLs.
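
For anyone wondering what "at most one version per day" means in practice, here is a minimal sketch of the idea. It is not the importer's actual code: `Capture` and `one_per_day` are hypothetical names made up for illustration, and the sketch assumes timezone-aware timestamps and keeps the earliest capture per URL per UTC day. The real option is looser than this, since it can import more than one version per day in some cases.

```python
from datetime import datetime, timezone
from typing import Iterable, Iterator, NamedTuple


class Capture(NamedTuple):
    """Hypothetical stand-in for one Internet Archive capture of a page."""
    url: str
    timestamp: datetime  # assumed to be timezone-aware


def one_per_day(captures: Iterable[Capture]) -> Iterator[Capture]:
    """Yield the earliest capture for each (URL, UTC calendar day) pair."""
    seen = set()
    for capture in sorted(captures, key=lambda c: c.timestamp):
        day = capture.timestamp.astimezone(timezone.utc).date()
        key = (capture.url, day)
        if key not in seen:
            seen.add(key)
            yield capture


if __name__ == "__main__":
    # Made-up sample data: two captures on one day, one on the next.
    sample = [
        Capture("https://example.gov/", datetime(2025, 4, 8, 1, 0, tzinfo=timezone.utc)),
        Capture("https://example.gov/", datetime(2025, 4, 8, 17, 30, tzinfo=timezone.utc)),
        Capture("https://example.gov/", datetime(2025, 4, 9, 2, 15, tzinfo=timezone.utc)),
    ]
    for kept in one_per_day(sample):
        print(kept.url, kept.timestamp.isoformat())
```

Grouping by UTC day is a choice made for this sketch; the tool may draw day boundaries differently.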

Mr0grog, Apr 09 '25 00:04