New ZIM: Mankier.com
Please use the following format for a ZIM creation request…
- Website URL: https://www.mankier.com/
- License: Copyrighted | CC-by-sa | Public domain | ...
- Desired ZIM Title: ManKier
- Desired ZIM Description: Linux man pages
- Desired ZIM Icon –png (URL or attach one): https://www.mankier.com/img/kier-sq.png
- Language (ISO 639-3): eng
- Desired Main Page (homepage): n/a
- Is this a MediaWiki?: yes | no
- Articles List URL (mediawiki): n/a
Sorry, I don't know whether this is possible, but mankier.com is something developers need.
This is possible and a good idea.
Requested https://farm.openzim.org/recipes/mankier
Succeeded.
@RavanJAltaie Thank you for your effort and the info. I checked out ManKier_2022-12.zim. Unfortunately it's only a 300 KB file and it doesn't work.
Yeah, I confirm it only grabbed the first page: https://dev.library.kiwix.org/viewer#mankier_2022-12/A/www.mankier.com/
This cannot work with Zimit: the website relies on a web API. I would tag this as "Scraper needed" at least, or decide we will never ZIM this (but since the need makes sense, we should find an alternative).
I have some doubts regarding licensing, given that the code seems to be closed-source.
I've pinged the website owner to ask for clarification.
We got permission (see https://kiwix.freshdesk.com/a/tickets/70652). Anything they could do to help?
Super cool!
Unfortunately, the Zimit scraper cannot be used here, because it cannot scrape the database and API service that answer search requests for a man page.
So I'm certain they can help if they want to. At least we can ask them how they would recommend to create an offline version of their website.
- Would they be open to sharing the database with us, so that we can write a custom scraper on top of it?
- Would they be open to sharing the source code of their website (the rendering engine seems to be open source, but not the rest of the website), so that we can leverage it to build the scraper more quickly?
- Would they be open to contributing to this custom scraper effort? They could perhaps easily adapt their website into a "static website" version that uses no API or database, just plain (JSON) files, so that we can quickly build the scraper on top of this static website.
Details could be discussed in a live meeting if they have interest in such a project and/or directly in this issue.
Hi Benoit,
There is an API and an underlying DB, used for the search and by some third parties... my assumption was you can ignore this if the goal is to package the content of the man pages which is static HTML, and exclude the search input box in Kiwix.
To get a list of all the pages I would suggest starting in the sections as I mentioned below. You can see how many pages there are per section: https://www.mankier.com/stats
Cheers, Jackson
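For reference, here is a minimal sketch of the "start from the sections" idea above: read each section index page and collect the same-host links it contains. Everything in it is an assumption made for illustration: the `list_pages` helper, the `fetch` callable, and the `example.org` URLs are hypothetical, and the real section index URLs and markup on mankier.com may well differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def list_pages(section_urls, fetch):
    """Return the sorted same-host URLs linked from the section indexes.

    `fetch` is any callable mapping a URL to its HTML text (hypothetical;
    in practice it would wrap an HTTP client).
    """
    pages = set()
    for section_url in section_urls:
        host = urlparse(section_url).netloc
        collector = LinkCollector()
        collector.feed(fetch(section_url))
        for href in collector.hrefs:
            # Resolve relative links and drop any #fragment.
            url = urljoin(section_url, href).split("#")[0]
            if urlparse(url).netloc == host and url not in section_urls:
                pages.add(url)
    return sorted(pages)


# Hypothetical in-memory site standing in for the real section indexes.
FAKE_SITE = {
    "https://example.org/1": '<a href="/1/ls">ls</a> <a href="/1/cat#name">cat</a>',
    "https://example.org/8": '<a href="/8/mount">mount</a> <a href="https://other.example/x">x</a>',
}

pages = list_pages(list(FAKE_SITE), FAKE_SITE.__getitem__)
print(pages)
```

Passing `fetch` in as a parameter keeps the sketch runnable offline; the per-section counts on the stats page could then serve as a sanity check on how many URLs were found.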
Recipe reconfigured (I also slightly altered the title and description for more precision) and task requested: https://farm.openzim.org/pipeline/d31651c5-0ffe-4492-a04b-3298a4c39980
Note: excluding the search box is not straightforward with custom CSS; at least, I failed to find a proper CSS selector. Let's live with it for a first version; we can fix that later if the first ZIM is mostly OK.
ZIM is ready and mostly OK: https://dev.library.kiwix.org/viewer#www.mankier.com_en_all_2024-06
There is just one big problem: the page https://dev.library.kiwix.org/viewer#www.mankier.com_en_all_2024-06/www.mankier.com/ is completely broken. I'll open an upstream issue.
Nice. I couldn't find the problematic page you mentioned. How does one get there?
Click on the "Home" link
Upstream issue has been fixed. New ZIM is ready in dev; I've added custom CSS to hide ads, which are not particularly appealing or relevant once offline. Please review and move to prod if OK for you.
@Popolechien can you please review the dev file: https://dev.library.kiwix.org/#lang=&q=mankier
LGTM, ready for Prod.
File published to prod: https://library.kiwix.org/#lang=eng&q=mankier
Recipe set to quarterly update.