
Make a ZIM of http://cyclowiki.org/wiki/

Open kelson42 opened this issue 7 years ago • 21 comments

From @kelson42 on August 26, 2018 14:15

Copied from original issue: openzim/mwoffliner#363

kelson42 avatar Sep 18 '18 09:09 kelson42

mwoffliner --mwUrl="http://cyclowiki.org" --adminEmail="[email protected]" --localParsoid --verbose

dies with

Executing command : pngquant --verbose --strip --nofs --force --ext=".ag2w2.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.png" && advdef -q -z -4 -i 5 "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.ag2w2.png" && if [ $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.ag2w2.png") -lt $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.png") ]; then mv "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.ag2w2.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.png"; else rm "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.ag2w2.png"; fi
Executing command : pngquant --verbose --strip --nofs --force --ext=".2db4l.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.png" && advdef -q -z -4 -i 5 "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.2db4l.png" && if [ $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.2db4l.png") -lt $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.png") ]; then mv "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.2db4l.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.png"; else rm "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.2db4l.png"; fi
Executing command : pngquant --verbose --strip --nofs --force --ext=".b7rl0.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.png" && advdef -q -z -4 -i 5 "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.b7rl0.png" && if [ $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.b7rl0.png") -lt $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.png") ]; then mv "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.b7rl0.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.png"; else rm "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.b7rl0.png"; fi
Executing command : pngquant --verbose --strip --nofs --force --ext=".a5s3f.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.png" && advdef -q -z -4 -i 5 "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.a5s3f.png" && if [ $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.a5s3f.png") -lt $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.png") ]; then mv "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.a5s3f.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.png"; else rm "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.a5s3f.png"; fi
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.png
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.png
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.png
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icons.png
Saving favicon.png...
Downloading http://cyclowiki.org/w/api.php?action=query&meta=siteinfo&format=json...
Executing command : gifsicle --verbose --colors 64 -O3 "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.gif" -o "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.tz4ir.gif" && if [ $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.tz4ir.gif") -lt $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.gif") ]; then mv "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.tz4ir.gif" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.gif"; else rm "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.tz4ir.gif"; fi
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.png
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.gif
TypeError [ERR_INVALID_ARG_TYPE]: The "url" argument must be of type string. Received type undefined
    at Url.parse (url.js:150:11)
    at Object.urlParse [as parse] (url.js:144:13)
    at /usr/local/lib/node_modules/mwoffliner/lib/mwoffliner.lib.js:2083:29
    at async.retry (/usr/local/lib/node_modules/mwoffliner/lib/Downloader.js:176:7)
    at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:676:51
    at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:726:13
    at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:52:16
    at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:264:21
    at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:44:16
    at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:723:17

kelson42 avatar Sep 18 '18 09:09 kelson42

From @ISNIT0 on September 18, 2018 9:39

No logoUrl is returned by the api for this site (http://cyclowiki.org/w/api.php?action=query&meta=siteinfo&format=json)

What should be done in this situation? Use a placeholder?

kelson42 avatar Sep 18 '18 09:09 kelson42

@ISNIT0 It should stop with a proper error message inviting the user to use --customZimFavicon
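A minimal sketch of what such a guard could look like, not the actual mwoffliner code. The `SiteInfoResponse` shape and the `resolveFaviconSource` helper are hypothetical names introduced for illustration, and the sketch assumes the logo is read from `query.general.logo` of the siteinfo response:

```typescript
// Hypothetical shape of the part of the siteinfo response we care about.
interface SiteInfoResponse {
  query?: { general?: { logo?: string } };
}

// Hypothetical helper: returns the favicon source, or throws a readable
// error instead of letting `undefined` reach the URL parser.
function resolveFaviconSource(
  siteInfo: SiteInfoResponse,
  customZimFavicon?: string
): string {
  // An explicitly supplied favicon always wins.
  if (customZimFavicon) {
    return customZimFavicon;
  }
  const logo = siteInfo.query?.general?.logo;
  if (typeof logo !== "string" || logo.length === 0) {
    throw new Error(
      "The wiki's siteinfo response contains no logo URL; " +
        "please provide one with --customZimFavicon."
    );
  }
  return logo;
}
```

With a check of this kind, the scrape would fail early with an actionable message rather than with the `ERR_INVALID_ARG_TYPE` stack trace shown above.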

kelson42 avatar Sep 18 '18 09:09 kelson42

Impacted by https://github.com/openzim/mwoffliner/issues/387

kelson42 avatar Sep 19 '18 09:09 kelson42

Done at https://farm.openzim.org/recipes/cyclowiki

kelson42 avatar Dec 01 '20 18:12 kelson42

Still not published

benoit74 avatar Nov 10 '24 21:11 benoit74

Recipe moved to https://farm.openzim.org/recipes/cyclowiki.org_rus_all

benoit74 avatar Jan 13 '25 13:01 benoit74

We are blocked by Cloudflare: no matter which User-Agent / Referer we pass, Cloudflare still blocks us on the first call to sanitize_mwUrl (i.e. probably the first call ever) on most workers.
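For anyone trying to confirm this kind of block from a worker, here is a rough sketch of how a response could be classified. It is only a heuristic under stated assumptions: it assumes blocked responses carry `server: cloudflare` with a 403, 429 or 503 status, or a `cf-mitigated: challenge` header. `ResponseSummary` and `looksLikeCloudflareBlock` are hypothetical names, not part of mwoffliner.

```typescript
// Hypothetical, simplified view of an HTTP response:
// status code plus headers keyed by lower-cased name.
interface ResponseSummary {
  status: number;
  headers: Record<string, string>;
}

// Heuristic only: returns true when the response looks like it was
// produced by Cloudflare's protection rather than by the wiki itself.
function looksLikeCloudflareBlock(res: ResponseSummary): boolean {
  const server = (res.headers["server"] || "").toLowerCase();
  const mitigated = (res.headers["cf-mitigated"] || "").toLowerCase();
  // Assumption: challenge pages are flagged with this header.
  if (mitigated === "challenge") {
    return true;
  }
  // Assumption: these statuses served by Cloudflare indicate a block.
  const blockedStatus =
    res.status === 403 || res.status === 429 || res.status === 503;
  return blockedStatus && server === "cloudflare";
}
```

This does nothing to get around the block; it would only let the scraper report "blocked by Cloudflare" explicitly instead of failing with a generic HTTP error.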

benoit74 avatar Feb 03 '25 19:02 benoit74

So close as wontfix or do we wait for a decision on mwoffliner#2134?

Popolechien avatar Feb 04 '25 09:02 Popolechien

#2134 might help, so it is probably worth waiting

benoit74 avatar Feb 04 '25 10:02 benoit74

https://github.com/openzim/mwoffliner/issues/2134 is fixed, but we are now blocked by https://github.com/openzim/mwoffliner/issues/2091

benoit74 avatar Feb 19 '25 17:02 benoit74

New attempt with 1.15 at https://farm.openzim.org/recipes/cyclowiki.org_ru_all

kelson42 avatar May 30 '25 14:05 kelson42

This is clearly blocked by Cloudflare. Closing the issue as we currently have no way to work around this blockage. Should someone have an idea about how to work around this, feel free to ask for a reopen.

benoit74 avatar Jun 12 '25 15:06 benoit74

Last run is moving fine: https://farm.openzim.org/pipeline/ab1c0168-ca92-440d-be10-7c83de35231d

I'm pretty sure https://github.com/openzim/mwoffliner/pull/2395 was the change required.

benoit74 avatar Jul 01 '25 15:07 benoit74

Last attempt unfortunately died during the maxi computation. The reason is not that clear to me, but I have relaunched with dev and slowed down the speed. See https://farm.openzim.org/pipeline/cd2d2a5c-5d29-4b44-9eec-281af7e0cdd6

kelson42 avatar Jul 16 '25 09:07 kelson42

Last attempt was successful, see https://farm.openzim.org/recipes/cyclowiki.org_ru_all

@benoit74 @Popolechien ZIM files are available for final review at https://dev.library.kiwix.org/#lang=&q=cyclowiki. Are we good to go to production?

kelson42 avatar Jul 27 '25 10:07 kelson42

LGTM but I'm very very surprised by the size discrepancy between mini/nopic and maxi (905/914MB vs. 25GB). Should we also maybe do without the mini version?

Other than that, spotted this at Древнегреческий язык

Image

Source is here and shows those are two embedded youtube videos.

Not a blocker IMHO, but worth flagging. Should I open a separate issue?

Popolechien avatar Jul 27 '25 13:07 Popolechien

> Should I open a separate issue?

Yes, please do in openzim/mwoffliner; it is worth fixing (and especially deciding how to fix it)

benoit74 avatar Aug 18 '25 07:08 benoit74

> I'm very very surprised by the size discrepancy between mini/nopic and maxi (905/914MB vs. 25GB). Should we also maybe do without the mini version?

I'm sorry, I don't get it: what is the problem with having a very small mini ZIM if images on the wiki are very big / numerous and lead to a very big maxi?

benoit74 avatar Aug 18 '25 07:08 benoit74

Yeah, that was poorly phrased, as these are two separate questions:

  • Seeing how close in size mini and nopic are, I do not see the value of generating a mini (the distinct flavours/sizes are confusing to users, which is compounded by the fact that filtering is not really optimal).
  • I am surprised at the size discrepancy: on WP, for instance, nopic would be half the size of maxi, not 4% like here, so I was wondering whether this is a bug or a feature (i.e., is the wiki so media-rich that removing the media makes a world of difference?).
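
For reference, the "4%" figure follows from the sizes quoted earlier in the thread (905 MB and 914 MB vs. 25 GB). A quick sketch of the arithmetic, assuming 1 GB = 1000 MB; the helper name `sizeRatioPercent` is made up for this illustration:

```typescript
// Hypothetical helper: size of the smaller file as a percentage of the larger one.
// Assumes decimal units (1 GB = 1000 MB) unless told otherwise.
function sizeRatioPercent(
  smallMb: number,
  largeGb: number,
  mbPerGb: number = 1000
): number {
  return (smallMb / (largeGb * mbPerGb)) * 100;
}

// The two small flavours compared with the 25 GB maxi.
const ratio914 = sizeRatioPercent(914, 25); // about 3.7%
const ratio905 = sizeRatioPercent(905, 25); // about 3.6%

// The two small flavours compared with each other.
const smallFlavoursPercent = (905 / 914) * 100; // about 99%
```

Both small files are under 4% of the maxi and within about 1% of each other, which is what prompts the two questions above.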

Popolechien avatar Aug 18 '25 10:08 Popolechien

We are already only generating nopic and maxi; mini is already not generated AFAIK (or I am missing something).

Regarding size discrepancy, I do not have sufficient tooling/time to investigate details ATM.

benoit74 avatar Aug 18 '25 14:08 benoit74