zim-requests New ZIM: Mankier.com

Please use the following format for a ZIM creation request…

Website URL: https://www.mankier.com/
License: Copyrighted | CC-by-sa | Public domain | ...
Desired ZIM Title: ManKier
Desired ZIM Description: Linux man pages
Desired ZIM Icon –png (URL or attach one): https://www.mankier.com/img/kier-sq.png
Language (ISO 639-3): eng
Desired Main Page (homepage): n/a
Is this a MediaWiki?: yes | no
Articles List URL (mediawiki): n/a

I'm sorry, I don't know whether this is possible, but mankier.com is a much-needed resource for developers.

Jun 24 '19 10:06 trappedinspacetime

This is possible and a good idea.

Apr 30 '20 07:04 kelson42

Requested https://farm.openzim.org/recipes/mankier

Dec 03 '22 20:12 RavanJAltaie

Succeeded.

Dec 08 '22 20:12 RavanJAltaie

@RavanJAltaie Thank you for your effort and the info. I checked out ManKier_2022-12.zim. Unfortunately it's only a 300 KB file and it doesn't work.

Dec 09 '22 08:12 trappedinspacetime

Yeah, I confirm it only grabbed the first page: https://dev.library.kiwix.org/viewer#mankier_2022-12/A/www.mankier.com/

Dec 09 '22 09:12 Popolechien

This cannot work with Zimit because the website relies on a web API. I would tag this as "Scraper needed" at least, or decide we will never ZIM this (but the need makes sense, so we should find an alternative).

I also have some doubts about licensing, given that the code seems to be closed-source.

Jun 13 '24 13:06 benoit74

I've pinged the website owner to ask for clarification.

Jun 13 '24 13:06 Popolechien

We got permission (see https://kiwix.freshdesk.com/a/tickets/70652). Anything they could do to help?

Jun 14 '24 06:06 Popolechien

Super cool!

Unfortunately we cannot use the Zimit scraper, because we have no way to scrape the database and API service that return the responses to man page search requests.

So I'm certain they can help if they want to. At the very least we can ask how they would recommend creating an offline version of their website.

Would they be open to sharing the database with us so that we can write a custom scraper on top of it? Would they be open to sharing the source code of their website (the rendering engine seems to be open source, but not the rest of the website) so that we can leverage it to build the scraper more quickly? Would they be open to contributing to this custom scraper effort? They could perhaps easily adapt their website into a "static website" version that does not use any API or database, just plain (JSON) files, so that we can quickly create the scraper on top of this static website.

Details could be discussed in a live meeting if they have interest in such a project and/or directly in this issue.

Jun 14 '24 07:06 benoit74

Hi Benoit,

There is an API and an underlying DB, used for the search and by some third parties... my assumption was that you can ignore this if the goal is to package the content of the man pages, which is static HTML, and exclude the search input box in Kiwix.

To get a list of all the pages I would suggest starting in the sections as I mentioned below. You can see how many pages there are per section: https://www.mankier.com/stats

Cheers, Jackson
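To illustrate the "start from the sections" suggestion above, here is a minimal Python sketch of how a crawler could collect same-site page links from one section index page. It is only an illustration, not what Zimit or the recipe actually does: the helper name `same_site_links`, the sample HTML, and the `/1/ls` URL layout are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Collects the raw href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(html, base_url):
    """Return unique absolute links on the same host as base_url.

    Hypothetical helper: fragments (#...) are dropped so that anchors
    inside one man page do not count as separate pages.
    """
    collector = _LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    links = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href).split("#", 1)[0]
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links


# Made-up section index markup; the real page structure may differ.
SAMPLE = """
<ul>
  <li><a href="/1/ls">ls</a></li>
  <li><a href="/1/ls#options">ls options</a></li>
  <li><a href="https://example.org/other">elsewhere</a></li>
</ul>
"""

links = same_site_links(SAMPLE, "https://www.mankier.com/1")
print(links)
```

Running the same function over each section index would give a list of static pages to package, leaving the search API and DB out of scope.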

Recipe reconfigured (I also tweaked the title and description slightly to make them more precise) and task requested: https://farm.openzim.org/pipeline/d31651c5-0ffe-4492-a04b-3298a4c39980

Jun 17 '24 19:06 benoit74

Note: excluding the search box is not straightforward with custom CSS; at least, I failed to find a proper CSS selector. Let's live with it for a first version; we can fix that later if the first ZIM is mostly OK.

Jun 17 '24 19:06 benoit74

ZIM is ready and mostly OK: https://dev.library.kiwix.org/viewer#www.mankier.com_en_all_2024-06

There is just one big problem: the https://dev.library.kiwix.org/viewer#www.mankier.com_en_all_2024-06/www.mankier.com/ page is completely broken. I'll open an upstream issue.

Jun 21 '24 06:06 benoit74

Nice. I couldn't find the problematic page you mentioned. How does one get there?

Jun 21 '24 08:06 Popolechien

Click on the "Home" link

Jun 21 '24 09:06 benoit74

The upstream issue has been fixed. A new ZIM is ready in dev; I've added custom CSS to hide the ads, which are not particularly appealing or relevant once offline. Please review and move it to prod if it's OK for you.

Sep 14 '24 06:09 benoit74

@Popolechien can you please review the dev file: https://dev.library.kiwix.org/#lang=&q=mankier

Nov 02 '24 16:11 benoit74

LGTM, ready for Prod.

Nov 03 '24 15:11 Popolechien

File published to prod: https://library.kiwix.org/#lang=eng&q=mankier

Recipe set to quarterly update.

Nov 10 '24 07:11 benoit74