Make a ZIM of http://cyclowiki.org/wiki/
From @kelson42 on August 26, 2018 14:15
Copied from original issue: openzim/mwoffliner#363
mwoffliner --mwUrl="http://cyclowiki.org" --adminEmail="[email protected]" --localParsoid --verbose
dies with
Executing command : pngquant --verbose --strip --nofs --force --ext=".ag2w2.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.png" && advdef -q -z -4 -i 5 "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.ag2w2.png" && if [ $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.ag2w2.png") -lt $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.png") ]; then mv "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.ag2w2.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.png"; else rm "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.ag2w2.png"; fi
Executing command : pngquant --verbose --strip --nofs --force --ext=".2db4l.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.png" && advdef -q -z -4 -i 5 "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.2db4l.png" && if [ $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.2db4l.png") -lt $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.png") ]; then mv "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.2db4l.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.png"; else rm "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.2db4l.png"; fi
Executing command : pngquant --verbose --strip --nofs --force --ext=".b7rl0.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.png" && advdef -q -z -4 -i 5 "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.b7rl0.png" && if [ $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.b7rl0.png") -lt $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.png") ]; then mv "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.b7rl0.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.png"; else rm "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.b7rl0.png"; fi
Executing command : pngquant --verbose --strip --nofs --force --ext=".a5s3f.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.png" && advdef -q -z -4 -i 5 "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.a5s3f.png" && if [ $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.a5s3f.png") -lt $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.png") ]; then mv "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.a5s3f.png" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.png"; else rm "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.a5s3f.png"; fi
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/video-icon.png
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/audio-icon.png
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/document-icon.png
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icons.png
Saving favicon.png...
Downloading http://cyclowiki.org/w/api.php?action=query&meta=siteinfo&format=json...
Executing command : gifsicle --verbose --colors 64 -O3 "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.gif" -o "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.tz4ir.gif" && if [ $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.tz4ir.gif") -lt $(stat -c%s "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.gif") ]; then mv "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.tz4ir.gif" "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.gif"; else rm "/srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.tz4ir.gif"; fi
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/user-icon.png
Successfuly optimized /srv/kiwix-maintenance/mwoffliner/tmp/cyclowiki_ru_all_2018-09/s/watch-icon-loading.gif
TypeError [ERR_INVALID_ARG_TYPE]: The "url" argument must be of type string. Received type undefined
at Url.parse (url.js:150:11)
at Object.urlParse [as parse] (url.js:144:13)
at /usr/local/lib/node_modules/mwoffliner/lib/mwoffliner.lib.js:2083:29
at async.retry (/usr/local/lib/node_modules/mwoffliner/lib/Downloader.js:176:7)
at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:676:51
at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:726:13
at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:52:16
at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:264:21
at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:44:16
at /usr/local/lib/node_modules/mwoffliner/node_modules/async/lib/async.js:723:17
From @ISNIT0 on September 18, 2018 9:39
No logoUrl is returned by the API for this site (http://cyclowiki.org/w/api.php?action=query&meta=siteinfo&format=json)
What should be done in this situation? Use a placeholder?
@ISNIT0 It should stop with a proper error message inviting the user to pass --customZimFavicon
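A minimal sketch of the fail-fast behaviour requested here, not mwoffliner's actual code. It assumes the logo is exposed under `query.general.logo` in the `meta=siteinfo` response; `resolveLogoUrl` and its parameters are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: pick the favicon source or fail with an actionable message.
// `siteinfo` is the parsed JSON body of api.php?action=query&meta=siteinfo&format=json.
function resolveLogoUrl(siteinfo, mwUrl, customZimFavicon) {
  // An explicitly provided --customZimFavicon always wins.
  if (customZimFavicon) {
    return customZimFavicon;
  }
  const general = (siteinfo && siteinfo.query && siteinfo.query.general) || {};
  const logo = general.logo;
  if (typeof logo !== 'string' || logo.length === 0) {
    // Stop here instead of passing `undefined` on to a URL parser.
    throw new Error(
      `No logo URL returned by the siteinfo API of ${mwUrl}. ` +
        'Please provide one with --customZimFavicon.'
    );
  }
  // Logo URLs may be protocol-relative or site-relative; resolve against the wiki URL.
  return new URL(logo, mwUrl).href;
}
```

With a check like this, the run would end with a readable message rather than the `ERR_INVALID_ARG_TYPE` stack trace shown in the first comment.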
Impacted by https://github.com/openzim/mwoffliner/issues/387
Done at https://farm.openzim.org/recipes/cyclowiki
Still not published
Recipe moved to https://farm.openzim.org/recipes/cyclowiki.org_rus_all
We are blocked by Cloudflare: no matter which User-Agent / Referer we pass, Cloudflare still blocks us on the first call to sanitize_mwUrl (i.e. probably the first call ever) on most workers.
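To make such a block easier to spot in logs, a scraper could classify the first failing response. The sketch below is a heuristic under stated assumptions, not something mwoffliner is known to do: `looksLikeCloudflareBlock` is a hypothetical name, and it relies on the `cf-mitigated: challenge` header that Cloudflare documents for challenge pages, plus a weaker guess based on the `server` header and the status code.

```javascript
// Heuristic only: guess whether an HTTP response is a Cloudflare block or challenge.
// `headers` is a plain object of response headers (names in any letter case).
function looksLikeCloudflareBlock(status, headers) {
  const h = {};
  for (const [name, value] of Object.entries(headers || {})) {
    h[name.toLowerCase()] = String(value).toLowerCase();
  }
  // Cloudflare sets `cf-mitigated: challenge` on its challenge pages.
  if (h['cf-mitigated'] === 'challenge') {
    return true;
  }
  // Weaker fallback (assumption): an error status on a response served via Cloudflare.
  // This can also match a genuine origin error proxied by Cloudflare.
  const servedByCloudflare = h['server'] === 'cloudflare';
  return servedByCloudflare && [403, 429, 503].includes(status);
}
```

A positive result would only justify a clearer error message; it does nothing to get past the block itself.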
So do we close as wontfix, or do we wait for a decision on mwoffliner#2134?
#2134 might help, so it is probably worth waiting.
https://github.com/openzim/mwoffliner/issues/2134 is fixed, but we are now blocked by https://github.com/openzim/mwoffliner/issues/2091
New attempt with 1.15 at https://farm.openzim.org/recipes/cyclowiki.org_ru_all
This is clearly blocked by Cloudflare. Closing the issue, as we currently have no way to work around this blockage. Should someone have an idea about how to work around it, feel free to ask for the issue to be reopened.
The latest run is progressing fine: https://farm.openzim.org/pipeline/ab1c0168-ca92-440d-be10-7c83de35231d
I'm pretty sure https://github.com/openzim/mwoffliner/pull/2395 was the change required.
The last attempt unfortunately died during the maxi computation. The reason is not that clear to me, but I have relaunched with dev and slowed down the speed. See https://farm.openzim.org/pipeline/cd2d2a5c-5d29-4b44-9eec-281af7e0cdd6
Last attempt was successful, see https://farm.openzim.org/recipes/cyclowiki.org_ru_all
@benoit74 @Popolechien ZIM files are available for final review at https://dev.library.kiwix.org/#lang=&q=cyclowiki. Are we good to go to production?
LGTM but I'm very very surprised by the size discrepancy between mini/nopic and maxi (905/914MB vs. 25GB). Should we also maybe do without the mini version?
Other than that, spotted this at Древнегреческий язык
Source is here and shows those are two embedded youtube videos.
Not a blocker IMHO, but worth flagging. Should I open a separate issue?
> Should I open a separate issue?
Yes, please do so in openzim/mwoffliner; it is worth fixing (and especially deciding how to fix it).
> I'm very very surprised by the size discrepancy between mini/nopic and maxi (905/914MB vs. 25GB). Should we also maybe do without the mini version?
I'm sorry, I don't get it: what is the problem with having a very small mini ZIM if the images on the wiki are very big / numerous and lead to a very big maxi?
Yeah, that was poorly phrased, as these are two separate questions:
- Seeing how close in size `mini` and `nopic` are, I do not see the value of generating a `mini` (the distinct flavours/sizes are confusing to users, which is compounded by the fact that filtering is not really optimal).
- I am surprised at the size discrepancy (on WP, for instance, `nopic` would be half the size of `maxi`, not 4% like here), so I was wondering if this was a bug or a feature (i.e., is the wiki so media-rich that removing the media makes a world of difference?).
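For reference, the roughly 4% figure follows from the sizes quoted earlier in the thread (914MB for nopic, 25GB for maxi). A quick check, assuming decimal units (1GB = 1000MB), which the thread does not specify; `sizeRatioPercent` is just an illustrative helper:

```javascript
// Ratio of a smaller ZIM to a larger one, as a percentage.
function sizeRatioPercent(smallMB, largeMB) {
  return (smallMB / largeMB) * 100;
}

const nopicMB = 914;
const maxiMB = 25 * 1000; // assumption: 1GB = 1000MB
console.log(sizeRatioPercent(nopicMB, maxiMB).toFixed(1) + '%'); // prints "3.7%"
```

Using binary units instead (1GB = 1024MB) gives about 3.6%, so the conclusion is the same either way.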
We are already generating only nopic and maxi; mini is not generated AFAIK (unless I am missing something).
Regarding size discrepancy, I do not have sufficient tooling/time to investigate details ATM.