zim-tools

zimcheck should check for empty <title> entry

Open kelson42 opened this issue 4 years ago • 3 comments

It is important that every HTML front article has a valid, non-empty <title> entry; the check should also catch a missing <title> tag. Otherwise the whole Kiwix suggestion system will fail. See for example https://github.com/openzim/ted/issues/125

kelson42 avatar Dec 18 '21 10:12 kelson42

Agreed. But we may get false positives. Once the ZIM is written, the two situations "title is empty" and "title == path" are equivalent and cannot be distinguished.
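To illustrate why the two cases collapse, here is a minimal Python model. It assumes (this is my reading of the comment, not code taken from libzim) that a title equal to the path is stored as empty and that a reader falls back to the path when the stored title is empty. Both function names are hypothetical.

```python
from typing import Optional


def stored_title(path: str, title: Optional[str]) -> str:
    """Hypothetical writer side: a missing title, an empty title and a
    title equal to the path are all stored as the empty string."""
    if not title or title == path:
        return ""
    return title


def read_title(path: str, stored: str) -> str:
    """Hypothetical reader side: fall back to the path when nothing is stored."""
    return stored or path


# Two different inputs at creation time...
empty_case = stored_title("A/Some_page", "")
same_as_path_case = stored_title("A/Some_page", "A/Some_page")

# ...leave exactly the same trace in the written file.
print(empty_case == same_as_path_case)
print(read_title("A/Some_page", empty_case))
```

Under these assumptions a checker that only sees the written file cannot tell an accidentally empty title from a deliberate "title == path", hence the risk of false positives.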

As said in https://github.com/openzim/ted/issues/125#issuecomment-999489746, if no entry has a title, we don't have a title index at all. We may check for that first. We can also loop over all the entries in the xapian title index and in the front article list and compare them. By definition, front articles are put in the front article list AND indexed in the xapian title index. But if the real title is empty, the entry is not indexed. So we can detect that something went wrong at some point. But it is probably a bit more involved (not necessarily complex, but we have never checked a xapian database before).
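A rough sketch of that comparison, in Python for brevity. It assumes the front article list and the paths found in the title index have already been extracted into plain collections of paths; reading the actual xapian database is not shown, and the function names and message strings are hypothetical.

```python
from typing import Iterable, List


def unindexed_front_articles(
    front_article_paths: Iterable[str],
    title_indexed_paths: Iterable[str],
) -> List[str]:
    """Return front articles that are absent from the title index."""
    indexed = set(title_indexed_paths)
    return sorted(p for p in set(front_article_paths) if p not in indexed)


def check_title_index(
    front_article_paths: Iterable[str],
    title_indexed_paths: Iterable[str],
) -> List[str]:
    """Return human-readable problems; an empty list means nothing was found."""
    front = list(front_article_paths)
    indexed = list(title_indexed_paths)

    # First check: front articles exist but nothing is indexed at all,
    # which is what we expect when no entry has a title.
    if front and not indexed:
        return ["no title index: no entry seems to have a title"]

    # Second check: every front article should also be in the title index.
    return [
        f"front article not in title index (empty title?): {path}"
        for path in unindexed_front_articles(front, indexed)
    ]


if __name__ == "__main__":
    for problem in check_title_index(["A/a", "A/b", "A/c"], ["A/a", "A/c"]):
        print(problem)
```

The cheap "no index at all" test runs first, so the per-entry comparison only happens when there is an index to compare against.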

mgautierfr avatar Dec 22 '21 11:12 mgautierfr

@mgautierfr Your proposal seems to be another way to reach the same diagnosis. No opinion for the moment on which approach would be best... But we should definitely check for this, because missing titles have quite a strong impact on UX.

kelson42 avatar Dec 22 '21 11:12 kelson42

This ticket is clearly blocked by #331

kelson42 avatar Mar 23 '23 19:03 kelson42