Pagefind doesn't support stemming for the language zh-cn?

Open charliez0 opened this issue 7 months ago • 3 comments

I've used npx pagefind, but it doesn't seem to support Chinese. Why?

charliez0, Aug 25 '25 12:08

Hmm, Pagefind's npx distribution should support Chinese. What are you seeing when it doesn't?

bglw, Aug 27 '25 02:08

Running Pagefind v1.3.0 (Extended)
Running from: "D:\\cygwin64\\home\\charliez0\\module\\modern"
Source:       "dist"
Output:       "dist\\pagefind"

[Walking source directory]
Found 134 files matching **/*.{html}

[Parsing files]
Found a data-pagefind-body element on the site.
↳ Ignoring pages without this tag.

[Reading languages]
Discovered 1 language: zh-cn

[Building search indexes]
Total:
  Indexed 1 language
  Indexed 14 pages
  Indexed 1836 words
  Indexed 0 filters
  Indexed 0 sorts
Note: Pagefind doesn't support stemming for the language zh-cn.
Search will still work, but will not match across root words.
Note: Pagefind doesn't support stemming for the language zh-cn.
Search will still work, but will not match across root words.

Finished in 0.463 seconds

charliez0, Aug 27 '25 03:08

Ah, yes. Searching in the browser isn't as good as it could be for languages that don't reliably separate words with whitespace.

Since Pagefind's initial release, the Intl.Segmenter API has become widely supported in browsers, so this is something that could be improved in a future release 🙂

bglw, Aug 27 '25 04:08
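
For anyone landing here, a rough sketch of what the Intl.Segmenter point means in practice. This is not Pagefind's implementation; `segmentWords` and the sample sentence are hypothetical and only show why whitespace splitting falls short for Chinese. It assumes a runtime that ships Intl.Segmenter with CJK dictionary data, and falls back to whitespace splitting otherwise. The exact word boundaries depend on the runtime's ICU data, so no particular segmentation is promised here.

```typescript
// Hypothetical helper, not part of Pagefind: split text into word-like segments.
// The `any` cast sidesteps TypeScript lib targets that predate the ES2022 Intl typings.
function segmentWords(text: string, locale: string): string[] {
  const SegmenterCtor = (Intl as any).Segmenter;
  if (typeof SegmenterCtor !== "function") {
    // Assumed fallback for runtimes without Intl.Segmenter: plain whitespace splitting.
    return text.split(/\s+/).filter((t) => t.length > 0);
  }
  const segmenter = new SegmenterCtor(locale, { granularity: "word" });
  const words: string[] = [];
  for (const part of segmenter.segment(text)) {
    // isWordLike is false for punctuation and whitespace segments, so skip those.
    if (part.isWordLike) words.push(part.segment);
  }
  return words;
}

// Sample sentence ("I like static site search"), written without any spaces.
const sample = "我喜欢静态网站搜索";

// Whitespace splitting sees the whole sentence as a single token.
const byWhitespace = sample.split(/\s+/).filter((t) => t.length > 0);
console.log(byWhitespace.length); // 1

// Intl.Segmenter can break the same sentence into smaller word-like pieces.
const bySegmenter = segmentWords(sample, "zh-CN");
console.log(bySegmenter);
```

With whitespace splitting, a query for a single word inside that sentence has no separate token to match against, which fits the behaviour described above.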