Properly handle article lists containing redirects with local anchors

Open mgautierfr opened this issue 5 years ago • 15 comments

The articles http://library.kiwix.org/wikipedia_en_medicine_nodet_2019-08/A/Retina and http://library.kiwix.org/wikipedia_en_medicine_nodet_2019-08/A/Lipemia_retinalis are nearly identical.

They have the same title and the same content, but the HTML differs slightly. In particular, A/Lipemia_retinalis includes a <link href="../-/s/css_modules/mediawiki.action.view.redirectPage.css" rel="stylesheet" type="text/css" class=""> element.

mgautierfr avatar Aug 11 '19 17:08 mgautierfr

@ISNIT0 "Lipemia retinalis" should be a redirect to "Retina"

kelson42 avatar Aug 11 '19 17:08 kelson42

@ISNIT0 The problem here is that "Lipemia retinalis" is not a simple redirect to "Retina": it redirects to a specific section, "Retina#Diseases and disorders". The built-in ZIM redirect system cannot express this, so it should be done with a normal HTML page redirect.
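To illustrate what a "normal HTML page redirect" could look like, here is a minimal sketch in TypeScript. It is not mwoffliner's actual code: `buildAnchorRedirectPage` and `escapeHtmlAttribute` are hypothetical helpers, and the sketch assumes a simple section anchor where spaces map to underscores and the redirect is done with a `<meta http-equiv="refresh">` tag.

```typescript
// Hypothetical helper (not part of mwoffliner): escape a value for use
// inside a double-quoted HTML attribute or as text content.
function escapeHtmlAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: build a small HTML page that redirects the reader
// to `targetPath#anchor`, keeping the local anchor that a plain ZIM
// redirect entry cannot carry.
function buildAnchorRedirectPage(
  title: string,
  targetPath: string,
  anchor: string,
): string {
  // Assumption: section anchors use underscores in place of spaces.
  const fragment = encodeURIComponent(anchor.replace(/ /g, "_"));
  const url = escapeHtmlAttribute(`${encodeURI(targetPath)}#${fragment}`);
  const safeTitle = escapeHtmlAttribute(title);
  return [
    "<!DOCTYPE html>",
    "<html>",
    "<head>",
    '<meta charset="utf-8">',
    `<title>${safeTitle}</title>`,
    `<meta http-equiv="refresh" content="0;url=${url}">`,
    "</head>",
    "<body>",
    // Fallback link for readers that ignore the meta refresh.
    `<a href="${url}">${safeTitle}</a>`,
    "</body>",
    "</html>",
  ].join("\n");
}

const redirectPage = buildAnchorRedirectPage(
  "Lipemia retinalis",
  "Retina",
  "Diseases and disorders",
);
console.log(redirectPage);
```

The relative target `Retina#Diseases_and_disorders` assumes both entries live in the same namespace directory (here `A/`); a different layout would need a different relative path.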

kelson42 avatar Aug 24 '19 17:08 kelson42

@kelson42 Do you mean the problem is that the articleList contains redirects, and MWO is not resolving them?

ISNIT0 avatar Aug 27 '19 10:08 ISNIT0

@ISNIT0 Yes, I confirm the redirect is in the list at https://ftp.nluug.nl/pub/kiwix/wp1/enwiki_2019-08/customs/medicine. Do you mean it might be a duplicate of #889?

kelson42 avatar Aug 27 '19 10:08 kelson42

I think it is, yes

ISNIT0 avatar Aug 27 '19 10:08 ISNIT0

@ISNIT0 It is, but the problem is a bit more complex here because of the hash at the end of the URL. Still, I would agree to handle them together in 2.0 if that is what you prefer. The two issues are indeed very similar.

kelson42 avatar Aug 27 '19 10:08 kelson42

Surely that's fine? The reader serves the content from the redirect, and the hash is preserved? I have a fix nearly ready

ISNIT0 avatar Aug 27 '19 10:08 ISNIT0

For information, zimwriterfs parses the content of the HTML to detect a redirect and create a redirect article. (See https://github.com/openzim/zimwriterfs/blob/master/src/article.cpp#L143-L159)
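As a rough illustration of that kind of detection (this is not the zimwriterfs code, which is C++; see the link above for the real implementation), the TypeScript sketch below looks for a `<meta http-equiv="refresh">` tag and splits the target into path and hash. The function name `detectMetaRefreshRedirect`, the `HtmlRedirect` type, and the regex-based parsing are all assumptions made for the example.

```typescript
// Hypothetical result type: the redirect target split at the first "#".
interface HtmlRedirect {
  path: string;
  fragment: string | null;
}

// Hypothetical helper: return the redirect target declared by a
// <meta http-equiv="refresh" content="N;url=..."> tag, or null if the
// page declares none. Regex scanning is a simplification; a real
// implementation would use an HTML parser.
function detectMetaRefreshRedirect(html: string): HtmlRedirect | null {
  const metaTags = html.match(/<meta\b[^>]*>/gi) ?? [];
  for (const tag of metaTags) {
    if (!/http-equiv\s*=\s*["']?refresh["']?/i.test(tag)) {
      continue;
    }
    // Capture the quoted value of the content attribute.
    const content = /content\s*=\s*(["'])(.*?)\1/i.exec(tag);
    if (!content) {
      continue;
    }
    const url = /url\s*=\s*(.+)$/i.exec(content[2] ?? "");
    if (!url) {
      continue;
    }
    const target = (url[1] ?? "").trim();
    const hashIndex = target.indexOf("#");
    if (hashIndex === -1) {
      return { path: target, fragment: null };
    }
    return {
      path: target.slice(0, hashIndex),
      fragment: target.slice(hashIndex + 1),
    };
  }
  return null;
}

const sample =
  '<html><head><meta http-equiv="refresh" ' +
  'content="0;url=Retina#Diseases_and_disorders"></head></html>';
console.log(detectMetaRefreshRedirect(sample));
```

Keeping the fragment separate from the path makes the difficulty in this issue visible: a redirect entry can point at `path`, but the `fragment` has nowhere to go unless the HTML page itself is kept.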

mgautierfr avatar Sep 02 '19 13:09 mgautierfr

This issue should have been closed by the PR above. I will close it for now; we can reopen it if it's not fixed :)

ISNIT0 avatar Sep 02 '19 15:09 ISNIT0

@ISNIT0 Thanks for the fix. I have just reopened it because I want to verify this myself.

kelson42 avatar Sep 02 '19 15:09 kelson42

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Nov 01 '19 15:11 stale[bot]

The problem here is how we create the list of articles for the medicine selection: the list is full of redirects.
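One way to clean such a list, sketched in TypeScript under stated assumptions: the MediaWiki API reports redirects as `from` / `to` / optional `tofragment` entries under `query.redirects` when `action=query` is called with `redirects=1`. The `resolveArticleList` function and `ResolvedList` type are hypothetical, the sketch assumes titles are already normalized the same way as in the API response, and the `tofragment` value below is illustrative only.

```typescript
// Shape of one entry in `query.redirects` from the MediaWiki API.
interface ApiRedirect {
  from: string;
  to: string;
  tofragment?: string;
}

// Hypothetical result type: the deduplicated list of real articles, plus
// the redirects that carry a local anchor and need special handling.
interface ResolvedList {
  articles: string[];
  anchoredRedirects: ApiRedirect[];
}

// Hypothetical helper: replace redirect titles by their targets and keep
// track of the redirects that point to a section.
function resolveArticleList(
  titles: string[],
  redirects: ApiRedirect[],
): ResolvedList {
  const byFrom = new Map<string, ApiRedirect>();
  for (const redirect of redirects) {
    byFrom.set(redirect.from, redirect);
  }

  const articles = new Set<string>();
  const anchoredRedirects: ApiRedirect[] = [];
  for (const title of titles) {
    const redirect = byFrom.get(title);
    if (!redirect) {
      articles.add(title);
      continue;
    }
    // Keep the target once, even if it is also listed directly.
    articles.add(redirect.to);
    if (redirect.tofragment) {
      anchoredRedirects.push(redirect);
    }
  }
  return { articles: Array.from(articles), anchoredRedirects };
}

const resolved = resolveArticleList(
  ["Retina", "Lipemia retinalis"],
  [
    {
      from: "Lipemia retinalis",
      to: "Retina",
      tofragment: "Diseases_and_disorders", // illustrative value
    },
  ],
);
console.log(resolved.articles);
console.log(resolved.anchoredRedirects.length);
```

With the example input, "Retina" appears only once in `articles`, and the anchored redirect is reported separately so that it can be handled as an HTML page redirect rather than dropped.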

kelson42 avatar Mar 17 '20 18:03 kelson42

This issue was moved by kelson42 to openzim/wp1_selection_tools#27.

ghost avatar Mar 17 '20 18:03 ghost

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Jun 11 '20 09:06 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Mar 20 '21 00:03 stale[bot]