browsertrix-crawler

wrong disk utilization estimate while crawling on external drive

Open Schoeneh opened this issue 8 months ago • 1 comments

Others and I have run into a problem when running the Docker image with the crawl output on an external drive: when the internal disk is nearly full while the external drive has plenty of space left, the crawler stops with the error "not enough free space on disk left".

The expected behavior is that browsertrix-crawler only looks at the free space available to the directory it is run in. That is not what happens.
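To make the discrepancy visible, here is a minimal sketch that compares the used-space percentage of the internal disk with that of the output directory. It only uses POSIX `df -P`; it is not taken from the crawler's code and makes no claim about how the crawler computes its estimate. `used_pct` and `EXTERNAL_DIR` are hypothetical names, and `/` is assumed to be on the internal disk.

```shell
#!/bin/sh
# Print the used-space percentage of the filesystem that holds the given path.
# `df -P` selects the POSIX output format; column 5 of the data row is
# "Capacity", e.g. "42%". The awk step strips the percent sign.
used_pct() {
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical paths: "/" stands in for the internal disk, EXTERNAL_DIR for
# the directory on the external drive (defaults to the current directory).
EXTERNAL_DIR="${EXTERNAL_DIR:-.}"
echo "internal (/): $(used_pct /)% used"
echo "output ($EXTERNAL_DIR): $(used_pct "$EXTERNAL_DIR")% used"
```

If the two numbers differ a lot and the crawler still aborts, that suggests the check is reading the first filesystem rather than the second.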

This issue is also described here:

"it crashed after downloading a few hundred megabytes, saying there is not enough free space on my disk left. I had it write the data to an external hard drive with more than 4 TB of free space, but somehow, it seems to ignore that"
https://chaos.social/@flauschzelle/114675220670649685

Schoeneh, Jun 14 '25 16:06

Thank you @Schoeneh for the bug report. In the meantime, if you need an immediate workaround, passing --diskUtilization 0 disables the check.
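As a rough sketch of the workaround, the flag can be appended to a `docker run` invocation that follows the pattern in the project README. `OUT_DIR`, `URL`, `run_crawl`, and `DRY_RUN` are hypothetical names introduced here for illustration; the mount point and URL are placeholders to adapt.

```shell
#!/bin/sh
# OUT_DIR is a hypothetical directory on the external drive; URL is a placeholder.
OUT_DIR="${OUT_DIR:-/mnt/external/crawls}"
URL="${URL:-https://example.com/}"

run_crawl() {
  # With DRY_RUN=1 the command is printed instead of executed.
  # --diskUtilization 0 disables the disk utilization check.
  ${DRY_RUN:+echo} docker run -v "$OUT_DIR:/crawls/" -it webrecorder/browsertrix-crawler \
    crawl --url "$URL" --diskUtilization 0
}
```

Calling `run_crawl` starts the crawl; `DRY_RUN=1 run_crawl` only prints the command so it can be checked first.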

tw4l, Jun 16 '25 12:06

We have turned off the disk utilization check by default (--diskUtilization now defaults to 0), so this shouldn't happen unless the option is set explicitly.

ikreymer, Jul 04 '25 06:07