Titles of Chinese articles are bad

Open benoit74 opened this issue 8 months ago • 4 comments

Another problem, found with the same test ZIM file made for the comment above, is that article titles are not displayed properly in the full-text search results page.

Image

Originally posted by @kelson42 in #1576

benoit74 avatar Mar 31 '25 07:03 benoit74

Might be related to, or even the root cause of, https://github.com/openzim/mwoffliner/issues/2199 ; see that issue for the command to run as well.

benoit74 avatar Mar 31 '25 07:03 benoit74

This is a bug in the Wikimedia mobile-html REST endpoint. I've opened a ticket there, and there is nothing to do in mwoffliner: https://phabricator.wikimedia.org/T390705

The title is fine when using the WikimediaDesktop renderer:

Image

benoit74 avatar Apr 01 '25 12:04 benoit74

@benoit74 Thank you for the analysis. In this specific case, I would like us to consider building a workaround for the time being. Is that reasonably feasible?

kelson42 avatar Apr 02 '25 11:04 kelson42

Why? This is not really impacting the scraper for now (only when using WikimediaMobile, which we do not expect to use for now), and this is clearly a WMF bug (even a browser opening the page fails to render the page title).

But yes, it is doable, of course.

benoit74 avatar Apr 03 '25 09:04 benoit74