
Allow crawl space threshold to be set on CLI, report space avail

Open machawk1 opened this issue 1 year ago • 1 comments

Closes #61

@CorentinB Please note that I'm not proficient in Go, so feedback is welcome. Feel free to edit at will.

machawk1 avatar May 15 '24 21:05 machawk1

Thanks for your contribution @machawk1. One thing to note: WARC writing is asynchronous, and the WARC writing queue is displayed when running Zeno with --live-stats. An edge case my code does not currently handle is when disk space gets too low and the crawl pauses, but the WARC writing queue already holds enough data to use up the remaining space entirely.

This happens because the pause mechanism here does not pause WARC writing: records that were already in the WARC writing queue are still written to disk. I hope that's clear.

Not saying you have to do anything about it here, and solving this would require some additional synchronization between Zeno and the underlying warc library, but I thought it was worth mentioning. :)
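
To make the edge case concrete, here is a minimal sketch, not Zeno's actual code. It assumes a Unix-like system (it uses `syscall.Statfs`) and assumes the size of the pending WARC queue in bytes could be obtained somehow, which is exactly the synchronization that does not exist today. The names `availableBytes`, `shouldPause`, and the example values are hypothetical.

```go
package main

import (
	"fmt"
	"syscall"
)

// availableBytes reports the space available to unprivileged users on the
// filesystem that contains path. Unix-like systems only.
func availableBytes(path string) (uint64, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return 0, err
	}
	return uint64(st.Bavail) * uint64(st.Bsize), nil
}

// shouldPause is a hypothetical check: it subtracts the bytes still waiting
// in the (assumed) WARC writing queue from the available space before
// comparing against the threshold, so queued writes cannot silently consume
// the safety margin.
func shouldPause(avail, queued, threshold uint64) bool {
	if queued >= avail {
		// The queue alone would fill the disk.
		return true
	}
	return avail-queued < threshold
}

func main() {
	avail, err := availableBytes(".")
	if err != nil {
		fmt.Println("could not read disk space:", err)
		return
	}

	// Illustrative values only; neither is taken from Zeno.
	const threshold = 20 << 30 // 20 GiB minimum free space
	const queued = 5 << 30     // 5 GiB assumed to be pending in the queue

	fmt.Printf("available: %d bytes, pause crawl: %v\n",
		avail, shouldPause(avail, queued, threshold))
}
```

A check that only compares `avail` with `threshold` would miss the case where `queued` is large, which is the situation described above.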

CorentinB avatar May 16 '24 08:05 CorentinB

Thanks! Really sorry for the delay here.

CorentinB avatar Jul 08 '24 19:07 CorentinB