Creating a ZIM from a website that hosts images on https://imageshack.com/

Open kroryan opened this issue 1 year ago • 2 comments

How can I create a ZIM that includes the pictures hosted on imageshack?

I tried this, but it doesn't fetch the pictures from imageshack:

sudo docker run -v /media/usb/output:/output --shm-size=1gb ghcr.io/openzim/zimit zimit --url url --name url1 --workers 10 --waitUntil domcontentloaded

Is there a way to do it?

kroryan avatar Jul 01 '24 18:07 kroryan

I'm pretty sure this is not easily feasible; imageshack has plenty of reasons to want to prevent this kind of crawling. Unless I'm mistaken, you need to log in to imageshack, so you need to pass the login information to Browsertrix crawler, which is run by kiwix. I'll give a few hints, but this would really deserve a very long tutorial. In Browsertrix crawler, this is done with browser profiles, see https://crawler.docs.browsertrix.com/user-guide/browser-profiles/ ; once you have a browser profile, you can pass the resulting tar.gz to the zimit CLI with the --profile argument. I recommend starting with only 1 worker; you can always increase it later once you have a working setup, but more workers also mean a higher chance of detection by anti-bot systems.
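A rough sketch of the two steps described above, as shell functions. Treat it as a starting point under assumptions, not a tested recipe: the `create-login-profile` invocation follows the Browsertrix crawler browser-profiles page linked above, the default profile file name `profile.tar.gz` is assumed, and `PROFILE_DIR`, `SEED_URL`, `ZIM_NAME` and the `/profiles` mount point are hypothetical placeholders to replace with your own values.

```shell
# Hypothetical placeholders: adjust to your own setup.
PROFILE_DIR="${PROFILE_DIR:-$PWD/profiles}"     # host directory that will hold the profile
OUTPUT_DIR="${OUTPUT_DIR:-/media/usb/output}"   # same output directory as in the question
SEED_URL="${SEED_URL:-https://example.com/}"    # site to log in to and crawl
ZIM_NAME="${ZIM_NAME:-url1}"

# Step 1: create a browser profile interactively.
# While this runs, open http://localhost:9223/ in a browser, log in, then save the profile.
# Assumption: the profile is written as profile.tar.gz into the mounted directory.
create_profile() {
  docker run -p 6080:6080 -p 9223:9223 \
    -v "$PROFILE_DIR:/crawls/profiles/" \
    -it webrecorder/browsertrix-crawler \
    create-login-profile --url "$SEED_URL"
}

# Step 2: run zimit with the saved profile and a single worker.
# The profile directory is mounted at /profiles (an arbitrary path inside the container).
crawl_with_profile() {
  docker run \
    -v "$OUTPUT_DIR:/output" \
    -v "$PROFILE_DIR:/profiles" \
    --shm-size=1gb ghcr.io/openzim/zimit zimit \
    --url "$SEED_URL" --name "$ZIM_NAME" \
    --workers 1 \
    --profile /profiles/profile.tar.gz
}
```

Run `create_profile` once, then `crawl_with_profile`; only raise `--workers` after a single-worker crawl produces a ZIM with the images you expect.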

benoit74 avatar Jul 01 '24 20:07 benoit74

Oh, sorry, I probably misread your situation. You are not crawling imageshack itself, but a website that uses imageshack as its image provider? How is that possible? I thought imageshack was quite restrictive about this ...

benoit74 avatar Jul 01 '24 20:07 benoit74