
Is it possible to customise the tokeniser in Full Text Search?

Open tonychan818 opened this issue 8 years ago • 2 comments

In YapDatabase's FTS, when it handles a sentence such as "how are you?", the index becomes ["how","are","you"], so I can find the target by typing any one of those words.

But for the Chinese text 你好嗎, the index becomes ["你好嗎"], so I can only find it if my query starts with the first character, '你'. Is there any way to customise it so that it indexes ["你","好","嗎"], separating the characters one by one?

tonychan818, Apr 01 '16 10:04

You can specify a custom tokenizer by doing something like this:

    [[YapDatabaseFullTextSearch alloc] initWithColumnNames:properties
                                                   options:@{ @"tokenize": @"unicode61 \"tokenchars=!#$%&?\"" }
                                                   handler:handler
                                                versionTag:@"1"];

You can replace unicode61 \"tokenchars=!#$%&?\" with whatever tokenizer argument you want passed through to SQLite, so that text is tokenized the way you need.
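
For the per-character Chinese case in the original question, one possible workaround (not something confirmed in this thread) is to pre-segment the text before it is handed to the index, so that a tokenizer that splits on whitespace, such as unicode61, sees each ideograph as its own token. Below is a minimal sketch in plain C, which also compiles inside an Objective-C file. The function name space_separate_cjk is hypothetical, and the sketch only recognises the basic CJK Unified Ideographs block (U+4E00 to U+9FFF).

```c
#include <stdlib.h>
#include <string.h>

/* Byte length of the UTF-8 sequence that starts with lead byte b.
   Stray continuation bytes and invalid lead bytes count as length 1. */
static size_t utf8_len(unsigned char b) {
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;
}

/* True when the 3 bytes at p are a well-formed sequence encoding a code
   point in U+4E00..U+9FFF (assumption: only this block is treated as CJK). */
static int is_cjk3(const unsigned char *p) {
    if ((p[1] & 0xC0) != 0x80 || (p[2] & 0xC0) != 0x80) return 0;
    unsigned cp = ((p[0] & 0x0Fu) << 12) | ((p[1] & 0x3Fu) << 6) | (p[2] & 0x3Fu);
    return cp >= 0x4E00 && cp <= 0x9FFF;
}

/* Hypothetical helper: returns a malloc'd copy of `utf8` in which every CJK
   ideograph is separated from its neighbours by a single space.
   Returns NULL if allocation fails. The caller must free() the result. */
char *space_separate_cjk(const char *utf8) {
    const unsigned char *s = (const unsigned char *)utf8;
    size_t n = strlen(utf8);
    /* At most one space is added per character next to a CJK character,
       so 2n + 1 bytes is always enough. */
    char *out = malloc(2 * n + 1);
    if (out == NULL) return NULL;

    size_t i = 0, o = 0;
    int prev_cjk = 0;
    while (i < n) {
        size_t len = utf8_len(s[i]);
        if (i + len > n) len = n - i;          /* truncated trailing sequence */
        int cjk = (len == 3) && is_cjk3(s + i);

        /* Insert a space between a CJK character and adjacent text,
           unless a space is already there. */
        if ((cjk || prev_cjk) && o > 0 && out[o - 1] != ' ' && s[i] != ' ') {
            out[o++] = ' ';
        }
        memcpy(out + o, s + i, len);
        o += len;
        i += len;
        prev_cjk = cjk;
    }
    out[o] = '\0';
    return out;
}
```

With this sketch, 你好嗎 becomes 你 好 嗎, while "how are you?" is returned unchanged. From Objective-C, the input could come from an NSString's UTF8String property and the result could be wrapped with +[NSString stringWithUTF8String:]. Whether feeding the segmented string to the YapDatabase handler gives good search behaviour is an untested assumption, and queries would likely need the same treatment.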

ksuther, Apr 04 '16 23:04

I tried setting the following options, but the database creation does not succeed. How can I use Custom Tokenizers as in the official documentation? (screenshot: Snip20200522_3)

coder-glzhu, May 22 '20 05:05