The 'exceptions' configuration option might not work properly for Chinese.

Open · stcer opened this issue 6 months ago • 1 comment

Bug Description:

➜  ~ cat << 'EOF' > /tmp/exc
泸州 => 泸州
资阳 => 资阳
EOF

mysql -v -P9306 -h0
--------------
drop table if exists t
--------------

--------------
create table t(f text) morphology='icu_chinese' charset_table='cjk' exceptions='/tmp/exc'
--------------

--------------
insert into t values(1, '泸州是四川省的一座历史文化名城,以其浓郁的酒文化而闻名'), (2, '资阳市位于四川东南部,是一个快速发展的现代化城市')
--------------

--------------
select * from t where match('资阳')
--------------

--------------
show meta
--------------

+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| total          | 1     |
| total_found    | 1     |
| total_relation | eq    |
| time           | 0.000 |
| keyword[0]     | 资    |
| docs[0]        | 1     |
| hits[0]        | 1     |
| keyword[1]     | 阳    |
| docs[1]        | 1     |
| hits[1]        | 1     |
+----------------+-------+

--------------
call keywords('资阳', 't')
--------------

+------+-----------+------------+
| qpos | tokenized | normalized |
+------+-----------+------------+
| 1    | 资        | 资         |
| 2    | 阳        | 阳         |
+------+-----------+------------+
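To spell out what the output above is being compared against: the title suggests the expectation is that `资阳`, being listed in the exceptions file, comes back from `call keywords` as a single token rather than as `资` and `阳`. The snippet below is only a toy sketch of that expectation, not Manticore's actual implementation. The `tokenize` function and its per-character fallback are hypothetical stand-ins for whatever the real tokenizer does with `charset_table='cjk'` and `morphology='icu_chinese'`.

```python
def tokenize(text, exceptions):
    """Toy tokenizer (hypothetical, not Manticore code).

    Exceptions are matched first, longest key first; any character not
    covered by an exception is emitted as its own token.
    """
    keys = sorted(exceptions, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                # An exception matched: emit its mapped form as one token.
                tokens.append(exceptions[key])
                i += len(key)
                break
        else:
            # No exception starts here: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens


# Same mappings as /tmp/exc in this report.
exceptions = {"泸州": "泸州", "资阳": "资阳"}

print(tokenize("资阳", exceptions))  # ['资阳']  (what the report seems to expect)
print(tokenize("资阳", {}))          # ['资', '阳']  (shape of the observed output)
```

In this sketch the second call mirrors the `tokenized` column shown above, which is what would be seen if the exceptions file had no effect on the query.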

Manticore Search Version:

6.3.0

Operating System Version:

centos7

Have you tried the latest development version?

None

Internal Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

  • [ ] Implementation completed
  • [ ] Tests developed
  • [ ] Documentation updated
  • [ ] Documentation reviewed
  • [ ] Changelog updated

stcer, Aug 09 '24 01:08