mwoffliner
Properly handle article lists containing redirects with local anchors
The articles http://library.kiwix.org/wikipedia_en_medicine_nodet_2019-08/A/Retina and http://library.kiwix.org/wikipedia_en_medicine_nodet_2019-08/A/Lipemia_retinalis are nearly identical.
They have the same title and the content is the same.
There are some differences in the HTML, in particular a <link href="../-/s/css_modules/mediawiki.action.view.redirectPage.css" rel="stylesheet" type="text/css" class=""> in A/Lipemia_retinalis.
@ISNIT0 "Lipemia retinalis" should be a redirect to "Retina"
@ISNIT0 The problem here is that "Lipemia retinalis" is not a simple redirect to "Retina": it is a redirect to a specific section, "Retina#Diseases and disorders". This is not possible with the built-in ZIM redirect system, so it has to be done with a normal HTML page redirect.
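To illustrate the HTML page redirect idea, here is a minimal sketch. It assumes the redirect page sits next to its target in the same namespace directory, that section anchors use underscores in place of spaces, and that a meta refresh is an acceptable way to express the redirect. `splitRedirectTarget` and `buildHtmlRedirect` are hypothetical helpers, not part of mwoffliner.

```typescript
// Hypothetical helpers for illustration only; not mwoffliner's actual code.
interface RedirectTarget {
  title: string;      // e.g. "Retina"
  fragment?: string;  // e.g. "Diseases_and_disorders"
}

// Split "Retina#Diseases and disorders" into its title and section anchor.
function splitRedirectTarget(raw: string): RedirectTarget {
  const hashIndex = raw.indexOf('#');
  if (hashIndex === -1) {
    return { title: raw };
  }
  // Assumption: anchors use underscores where the section title has spaces.
  const fragment = raw.slice(hashIndex + 1).replace(/ /g, '_');
  return { title: raw.slice(0, hashIndex), fragment: fragment || undefined };
}

// Build a tiny HTML page that forwards the reader to the target section.
function buildHtmlRedirect(raw: string): string {
  const { title, fragment } = splitRedirectTarget(raw);
  const path = encodeURIComponent(title.replace(/ /g, '_'));
  const url = fragment ? `${path}#${encodeURIComponent(fragment)}` : path;
  return '<!DOCTYPE html><html><head><meta charset="utf-8">' +
    `<meta http-equiv="refresh" content="0;url=${url}">` +
    '</head><body></body></html>';
}
```

Calling `buildHtmlRedirect('Retina#Diseases and disorders')` yields a page whose meta refresh points at `Retina#Diseases_and_disorders`, so the section anchor survives the redirect.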
@kelson42 Do you mean the problem is that the articleList contains redirects, and MWO is not resolving them?
@ISNIT0 Yes... I confirm it is in https://ftp.nluug.nl/pub/kiwix/wp1/enwiki_2019-08/customs/medicine. You mean it might be a duplicate of #889?
I think it is, yes
@ISNIT0 It is, but the problem is a bit more complex here because of the hash at the end of the URL... But I would agree to handle them together in 2.0 if this is what you prefer. This is indeed really similar.
Surely that's fine? The reader serves the content from the redirect, and the hash is preserved? I have a fix nearly ready
For information, zimwriterfs parses the HTML content to detect a redirect and creates a redirect article. (See https://github.com/openzim/zimwriterfs/blob/master/src/article.cpp#L143-L159)
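As a rough illustration of detecting a redirect from page HTML, here is a minimal sketch. It assumes the redirect is expressed as a `<meta http-equiv="refresh">` tag and uses regular expressions instead of a real HTML parser; it is not a port of the linked zimwriterfs code, and `detectMetaRefreshRedirect` is a hypothetical name.

```typescript
// Hypothetical helper for illustration only; not the zimwriterfs implementation.
// Returns the redirect URL found in a meta refresh tag, or null if there is none.
function detectMetaRefreshRedirect(html: string): string | null {
  // Collect every <meta ...> tag in the document.
  const metaTags: string[] = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    // Only tags with http-equiv="refresh" can carry a redirect.
    if (!/http-equiv\s*=\s*["']?refresh["']?/i.test(tag)) {
      continue;
    }
    // Read the content attribute, quoted with either " or '.
    const content =
      /content\s*=\s*"([^"]*)"/i.exec(tag) || /content\s*=\s*'([^']*)'/i.exec(tag);
    if (!content) {
      continue;
    }
    // The content looks like "0;url=Target#Anchor"; keep everything after "url=".
    const url = /url\s*=\s*(.+)$/i.exec(content[1]);
    if (url) {
      return url[1].trim();
    }
  }
  return null;
}
```

Because the whole value after `url=` is returned, a hash such as `#Diseases_and_disorders` is kept and the caller can decide how to handle it.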
This issue should have been closed by the PR above. I will close for now, we can re-open if it's not fixed :)
@ISNIT0 Thanks for the fix. I just reopened it because I want to verify this myself.
This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.
The problem here is how we create the list of articles for the medicine selection: the list is full of redirects.
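One way to clean such a list is to replace every redirect title with its target and drop the duplicates this creates. The sketch below assumes the redirect data has already been fetched and has the shape of the `redirects` array returned by a MediaWiki `action=query` request made with the `redirects` parameter (`from`, `to`, optional `tofragment`). `resolveArticleList` is a hypothetical helper, not code from mwoffliner or wp1_selection_tools.

```typescript
// Shape assumed from the MediaWiki query API "redirects" array.
interface MwRedirect {
  from: string;
  to: string;
  tofragment?: string; // section anchor, ignored when selecting whole articles
}

// Hypothetical helper: map redirect titles to their targets and de-duplicate,
// keeping the order in which targets first appear.
function resolveArticleList(titles: string[], redirects: MwRedirect[]): string[] {
  const targetByTitle: Record<string, string> = {};
  for (const redirect of redirects) {
    targetByTitle[redirect.from] = redirect.to;
  }
  const seen: Record<string, boolean> = {};
  const resolved: string[] = [];
  for (const title of titles) {
    const isRedirect = Object.prototype.hasOwnProperty.call(targetByTitle, title);
    const target = isRedirect ? targetByTitle[title] : title;
    if (!Object.prototype.hasOwnProperty.call(seen, target)) {
      seen[target] = true;
      resolved.push(target);
    }
  }
  return resolved;
}
```

With the list `['Retina', 'Lipemia retinalis', 'Eye']` and a single redirect from "Lipemia retinalis" to "Retina", the result contains "Retina" only once, followed by "Eye".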
This issue was moved by kelson42 to openzim/wp1_selection_tools#27.