browsertrix-crawler

Prediction of remaining time after restart

Open pbinkley opened this issue 3 years ago • 0 comments

After restarting a crawl, the predicted remaining time is far too short. I've just restarted a SUCHO crawl (using the Docker image provided) at 4294/9576, and after 11 minutes it predicts 13.7 minutes remaining. The prediction was something like 4 hours before the restart. I think it is assuming the 4294 pages were handled during the new run, at a rate of about 390/minute, and using that for the prediction (though the numbers don't work out exactly). Desired behavior: base the prediction of remaining time on the total running time of the original run and the restarted run.
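A minimal sketch of the two calculations, written in Python rather than the crawler's own code. The function names are hypothetical, and `prior_minutes` (the running time of the original run) is an assumed value of 180, since the issue does not state it. The first function reproduces the suspected behavior, where all finished pages are credited to the current session; the second divides by the combined running time of both runs.

```python
from typing import Optional


def naive_eta_minutes(done: int, total: int, session_minutes: float) -> Optional[float]:
    """Suspected current behavior: treat every finished page as if it
    had been crawled during the current (restarted) session."""
    if done <= 0 or session_minutes <= 0:
        return None  # no rate can be computed yet
    rate = done / session_minutes  # pages per minute, inflated after a restart
    return (total - done) / rate


def cumulative_eta_minutes(
    done: int, total: int, prior_minutes: float, session_minutes: float
) -> Optional[float]:
    """Desired behavior: use the total running time of the original run
    plus the restarted run when computing the crawl rate."""
    elapsed = prior_minutes + session_minutes
    if done <= 0 or elapsed <= 0:
        return None
    rate = done / elapsed
    return (total - done) / rate


# Figures from the report: 4294 of 9576 pages done, 11 minutes into the restart.
naive = naive_eta_minutes(4294, 9576, 11)

# prior_minutes=180 is a made-up value for illustration only.
cumulative = cumulative_eta_minutes(4294, 9576, prior_minutes=180, session_minutes=11)

print(f"naive: {naive:.1f} min, cumulative: {cumulative:.1f} min")
```

With the reported figures the naive estimate comes out at roughly 13.5 minutes, close to the 13.7 observed. The cumulative version only works if the elapsed time of the earlier run is stored alongside the saved crawl state, which is an assumption about how a fix might be implemented.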

pbinkley, Mar 07 '22 22:03