browsertrix-crawler
Prediction of remaining time after restart
After restarting a crawl, the predicted remaining time is far too short. I've just restarted a SUCHO crawl (using the Docker image provided) at 4294/9576, and after 11 minutes it predicts 13.7 minutes remaining. Before the restart, the prediction was around 4 hours.

I think it is assuming that the 4294 pages were all handled during the new run, at a rate of about 390/minute, and using that rate for the prediction (though the numbers don't work out exactly).

Desired behavior: base the prediction of remaining time on the total running time of the original run plus the restarted run.
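To make the suspected calculation concrete, here is a minimal sketch in Python. It is not taken from the browsertrix-crawler source; the function names and the 200-minute pre-restart running time are hypothetical, chosen only for illustration. The first function reproduces the behavior I suspect (all finished pages divided by the time of the current run only), and the second shows the desired behavior (all finished pages divided by the combined running time).

```python
from typing import Optional


def naive_eta_minutes(done: int, total: int, elapsed_this_run_min: float) -> Optional[float]:
    """Suspected current behavior: every finished page is credited to the current run."""
    if done <= 0 or elapsed_this_run_min <= 0:
        return None  # no rate can be estimated yet
    rate = done / elapsed_this_run_min  # pages per minute, inflated after a restart
    return (total - done) / rate


def combined_eta_minutes(
    done: int,
    total: int,
    elapsed_this_run_min: float,
    elapsed_before_restart_min: float,
) -> Optional[float]:
    """Desired behavior: rate is based on the running time of all runs together."""
    total_elapsed = elapsed_before_restart_min + elapsed_this_run_min
    if done <= 0 or total_elapsed <= 0:
        return None  # no rate can be estimated yet
    rate = done / total_elapsed  # pages per minute over the whole crawl
    return (total - done) / rate


# Numbers from this report: 4294 of 9576 pages done, 11 minutes into the restarted run.
naive = naive_eta_minutes(4294, 9576, 11)

# HYPOTHETICAL: assume the original run took 200 minutes (the real value is not known here).
combined = combined_eta_minutes(4294, 9576, 11, 200)

print(f"naive: {naive:.1f} min, combined: {combined:.1f} min")
```

With the numbers from this report the naive estimate comes out at roughly 13.5 minutes, close to the 13.7 minutes shown. The combined estimate depends entirely on the assumed earlier running time, which would need to be saved with the crawl state for this to work after a restart.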