zimit
How to ZIM websites with millions of pages
ZIMing a website with millions of pages is a problem.
Some folks managed to do that with significant manual intervention, by stopping the crawler regularly and restarting it so that the browser does not hang (it tends to hang after a few days of activity).
But this approach does not allow us to:
- restart from a previous ZIM which completed
- restart from a previous Crawl which completed
It is also not automated inside zimit itself, which makes it impossible to do in the Zimfarm for now (this could be considered a Zimfarm issue, I don't know).
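To make the manual workaround concrete, here is a rough sketch of what "stop the crawler regularly and restart it" amounts to as a supervisor loop. It is only an illustration: `run_in_chunks` is a hypothetical helper, not part of zimit, and it assumes the wrapped command is able to resume from its own saved state when started again, which is exactly the part that is missing today.

```python
import subprocess
import sys


def run_in_chunks(cmd, chunk_seconds, max_restarts):
    """Run `cmd` in time-limited chunks until it exits successfully.

    Returns the attempt number on which the command exited with code 0,
    or None if it never succeeded within `max_restarts` attempts.
    Hypothetical helper: `cmd` is assumed to resume its own progress.
    """
    for attempt in range(1, max_restarts + 1):
        try:
            # subprocess.run kills the child process when the timeout expires
            result = subprocess.run(cmd, timeout=chunk_seconds)
        except subprocess.TimeoutExpired:
            # Chunk is over: restart before the browser has a chance to hang
            continue
        if result.returncode == 0:
            return attempt
        # Non-zero exit: treated like a hang here, so simply try again
    return None
```

A call would look like `run_in_chunks(crawl_command, chunk_seconds=24 * 3600, max_restarts=30)`, where `crawl_command` is a placeholder list of arguments for whatever crawler invocation is being supervised. Without a way to resume from a previous crawl or ZIM, each restart would start from scratch, so this loop alone does not solve the problem.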
Sample zim-requests:
- https://github.com/openzim/zim-requests/issues/496
- https://github.com/openzim/zim-requests/issues/1172
- https://github.com/openzim/zim-requests/issues/1057
- https://github.com/openzim/zim-requests/issues/1081
- probably https://github.com/openzim/zim-requests/issues/1079