mdBook Chinese search support

Chinese search support.

Set language to zh, and then you can search for Chinese words.

[book]
authors = ["Zhou Yue"]
language = "zh"
multilingual = false
src = "src"
title = "trial"

Apr 03 '21 20:04 aowji

@ehuss

Why has this PR not been merged, and why has there been no reply? Many people in the Chinese community use mdBook and can currently search only in English, so why hasn't this PR been merged in 10 months? Is there a reason?

Jan 15 '22 14:01 ZhangHanDong

Sorry, I don't have time to review all PRs.

Just a quick scan of this PR, there are a number of issues:

The inclusion of the extra stuff needs to be conditional. For books not using Chinese, it is a significant extra cost. This includes building elasticlunr, which IIRC is a large increase, and the inclusion of extra JavaScript.
This PR includes formatting changes unrelated to the PR (such as indentation changes). Those should usually be separate.
It's not clear why the extra javascript is needed. Without some sort of explanation in the PR description, it requires reverse-engineering the code, which takes a lot of time.

Jan 15 '22 16:01 ehuss

@ehuss Thanks for the reply. Knowing the reasons makes it easier to improve the PR.

Jan 16 '22 05:01 ZhangHanDong

I tried it, and it seems that extra javascript should be included as additional-js. If so, then maybe we should treat the extra javascript files in another way?

[output.html]
additional-js = [
    "lunr.zh.js",
    "lunr.stemmer.support.js",
]

Apr 15 '22 16:04 Sciroccogti

Please add 'zh-CN', 'zh-HK', and 'zh-TW' as aliases for the language setting.
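
One way such aliases could be handled is to reduce a language tag to its primary subtag before choosing a search language. The sketch below is only an illustration of that idea in Python; the function name search_language is hypothetical and is not part of mdBook or this PR.

```python
def search_language(tag):
    """Reduce a language tag such as 'zh-CN' to its primary subtag ('zh').

    Hypothetical helper for illustration only; not mdBook's actual logic.
    """
    # Accept both 'zh-CN' and 'zh_CN' spellings, then keep the part
    # before the first separator and lowercase it.
    return tag.replace("_", "-").split("-")[0].lower()


for tag in ["zh", "zh-CN", "zh-HK", "zh-TW"]:
    print(tag, "->", search_language(tag))
```

With this approach every regional variant maps to the same 'zh' setting, so no per-alias table is needed.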

May 13 '22 08:05 Akagi201

Hi, why can't I search in Chinese? I also set "zh".

Jun 05 '22 18:06 xuscode

Any progress on this PR? Non-English search support is really needed.

Aug 21 '22 00:08 futurist

I think so too.

Dec 24 '22 22:12 tasuren

Is there any progress on this PR?

Jan 09 '23 09:01 TinySnow

It's not clear why the extra javascript is needed. Without some sort of explanation in the PR description, it requires reverse-engineering the code, which takes a lot of time.

The JavaScript comes from https://github.com/MihaiValentin/lunr-languages. A better option would be to just use "lunr-languages" and no longer use "elasticlunr". Until then, it may be necessary to wait for some progress to be made on #5.

A better option might be https://github.com/ajitid/fzf-for-js, a local search engine that supports Unicode. See https://github.com/ajitid/fzf-for-js/issues/112 for details on its Unicode support.

May 14 '23 10:05 wc7086

Chinese is usually troublesome because there are no word breaks, meaning that the indexing must be done via either a heuristic to break up words or a natural language processor that understands the text and can break words.

Otherwise, you'd need to index at least all individual characters as well as all pairwise combinations.
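
As a rough illustration of that fallback, here is a minimal Python sketch that emits every single character plus every adjacent pair from each run of Chinese characters. It assumes "pairwise combinations" means adjacent character pairs (bigrams) and checks only the CJK Unified Ideographs block; the names is_cjk and index_terms are hypothetical and do not come from mdBook, elasticlunr, or lunr-languages.

```python
def is_cjk(ch):
    # Assumption: only the basic CJK Unified Ideographs block
    # (U+4E00..U+9FFF) is treated as Chinese text here.
    return "\u4e00" <= ch <= "\u9fff"


def index_terms(text):
    """Return single characters and adjacent pairs for each CJK run."""
    terms = []
    run = []  # current run of consecutive CJK characters

    def flush():
        terms.extend(run)  # individual characters
        terms.extend(a + b for a, b in zip(run, run[1:]))  # adjacent pairs
        run.clear()

    for ch in text:
        if is_cjk(ch):
            run.append(ch)
        else:
            flush()  # a non-CJK character ends the current run
    flush()
    return terms


print(index_terms("中文搜索"))
```

Non-CJK text is skipped in this sketch on the assumption that a regular word tokenizer handles it. The cost is index size: a run of n characters yields n single-character terms and n - 1 pairs.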