lunr-languages
Error when using lunr.zh.js 'nodejieba.cut is not a function'
The error is:
Uncaught TypeError: nodejieba.cut is not a function
at lunr.zh.tokenizer (lunr.zh.js:98:1)
at lunr.Builder.add (lunr.js:2479:1)
at lunr.Builder.
The line in the lunr.zh.tokenizer is:
nodejieba.cut(str, true).forEach(function(seg) {
tokens = tokens.concat(seg.split(' '))
})
I'm afraid I'm not quite experienced enough at this time to dive in and resolve this myself, but if someone could review it, or let me know exactly what I would need to do to handle it, I would much appreciate it.
Are you trying to run this in a browser environment? The zh tokenizer requires Node to run, because it uses C++ addons (nodejieba). I opened an issue (#90) where I talk about how you can use the built-in Intl.Segmenter instead to segment Chinese (and other) languages quite easily. Here is a fork where I switched the zh module to using Intl.Segmenter.
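For reference, a minimal sketch of the Intl.Segmenter approach described above (the function name and filtering choice here are my own, not from the fork). Intl.Segmenter is built into modern browsers and Node.js 16+, so no native addon is needed:

```javascript
// Sketch: word-segmenting Chinese text with the built-in Intl.Segmenter
// instead of the nodejieba C++ addon. No native dependencies required.
function segmentChinese(str) {
  const segmenter = new Intl.Segmenter('zh', { granularity: 'word' });
  const tokens = [];
  for (const { segment, isWordLike } of segmenter.segment(str)) {
    // Keep only word-like segments, dropping punctuation and whitespace.
    if (isWordLike) tokens.push(segment);
  }
  return tokens;
}

console.log(segmentChinese('我喜欢编程'));
```

A tokenizer built this way could replace the nodejieba.cut call in lunr.zh.tokenizer, though exact segment boundaries depend on the ICU version the runtime ships with.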
@knubie this is awesome and solved the issue!