Add support for full-text index in ZIMs with multiple languages

Open benoit74 opened this issue 3 months ago • 2 comments

Due to an upstream issue in libzim with full-text indexes covering multiple languages, the scraper currently displays a warning saying that creating a full-text index on a multi-language ZIM is not recommended.

This warning should be removed once the upstream issue is fixed.

benoit74 avatar Oct 16 '25 08:10 benoit74

Hi @benoit74, I was exploring this issue and noticed it depends on an upstream fix in libzim for multi-language full-text indexing. I wanted to ask:

  • Is there already an upstream PR or branch in libzim addressing this, or is it still pending?
  • If the upstream fix is not ready yet, should this issue remain on hold for now?

As a thought: would it make sense to keep the warning but add an optional opt-in flag (e.g., an environment variable) so users with a patched libzim can enable multi-language indexing early?
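
For illustration, here is a minimal sketch of what such an opt-in could look like in Python. The variable name `ALLOW_MULTILANG_FULLTEXT_INDEX` and both helper functions are hypothetical, not existing scraper options, and the sketch assumes the scraper knows the list of ZIM languages and whether a full-text index was requested:

```python
import logging
import os

logger = logging.getLogger(__name__)

# Hypothetical environment variable name; not an existing scraper option.
OPT_IN_ENV_VAR = "ALLOW_MULTILANG_FULLTEXT_INDEX"


def should_warn(languages, with_fulltext_index, environ=None):
    """Return True when the multi-language full-text index warning applies."""
    environ = os.environ if environ is None else environ
    # The user opts in by setting the variable to a truthy value.
    value = environ.get(OPT_IN_ENV_VAR, "").strip().lower()
    opted_in = value in {"1", "true", "yes"}
    multi_language = len(set(languages)) > 1
    return bool(with_fulltext_index and multi_language and not opted_in)


def check_fulltext_index(languages, with_fulltext_index, environ=None):
    """Log the existing warning unless the user explicitly opted in."""
    if should_warn(languages, with_fulltext_index, environ):
        logger.warning(
            "Creating a full-text index on a multi-language ZIM is not "
            "recommended; set %s=1 to silence this warning.",
            OPT_IN_ENV_VAR,
        )
```

With this shape the warning stays on by default, and only users who set the variable themselves (for example because they run a patched libzim) would skip it.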

rawadhossain avatar Dec 11 '25 18:12 rawadhossain

This issue is indeed on hold, waiting for an upstream fix. The upstream fix is very hard to do, and we do not really know how we want to handle the situation. The upstream issue is https://github.com/openzim/libzim/issues/734

benoit74 avatar Dec 11 '25 20:12 benoit74