manticoresearch
manticoresearch copied to clipboard
Equals sign in search query breaks request
The bug reported here can be reproduced with HTTP request as well. How to reproduce:
-
create an index with the following settings:
create table t(f text) index_exact_words='1' expand_keywords='1'; -
make a select query to the index which has a match clause ending with '=' character, e.g. :
curl -v POST http://localhost:9308/sql --data-urlencode "query=select * from t where match('test=')" -
the following error message is sent in response from Manticore:
{"error":"syntax error, unexpected $end near ''"}
UPDATE
It was reported in the original issue that escaping does eliminate the error message but the query still returns incorrect empty result.
I dont think that is technically a 'bug' as such. = is a extended query operator, so has special limitations where it can be used. Its only meant ot be used before a keyword (to imply exact form match only)
If you not expecting = to be treated as a operator, then escape it.
... ie run the query as test\=
(its a bit more complicated in actual queries, as the \ might itself need escaping in the 'single quoted' sql string')
select * from t where match('test\\=')
Might also need escaping again in the curl command line, not sure!)
@barryhunter thank you for your answer. I've updated the issue taking it into account.
Re, the 'update'. Escaping the = should work, and then as not in (default) charset_table, will be treated as a seperator. So will match the same as 'test' on it is own.
But will get no results in above example, because didn't insert any data into the index! The index is empty.
mysql> drop table if exists t; create table t(f text) index_exact_words='1'; insert into t values(0,'Product 123 name='); select * from t where match('name\\=');
+-------------------+-------------------+
| id | f |
+-------------------+-------------------+
| 73773252815618049 | Product 123 name= | <-- the row is returned
+-------------------+-------------------+
mysql> show meta;
+----------------+-------+
| Variable_name | Value |
+----------------+-------+
| total | 1 |
| total_found | 1 |
| total_relation | eq |
| time | 0.004 |
| keyword[0] | name | <--- the = is not shown as it a seperator.
| docs[0] | 1 |
| hits[0] | 1 |
+----------------+-------+
mysql>
Tested at https://play.manticoresearch.com/replication/
I think the MRE is:
mysql> drop table if exists t; create table t(f text) index_exact_words='1' expand_keywords='1'; insert into t values(1,'a'); select * from t where match('a\\='); select *
from t where match('a\\-');
--------------
drop table if exists t
--------------
Query OK, 0 rows affected (0.13 sec)
--------------
create table t(f text) index_exact_words='1' expand_keywords='1'
--------------
Query OK, 0 rows affected (0.00 sec)
--------------
insert into t values(1,'a')
--------------
Query OK, 1 row affected (0.01 sec)
--------------
select * from t where match('a\\=')
--------------
Empty set (0.00 sec)
--------------
select * from t where match('a\\-')
--------------
+------+------+
| id | f |
+------+------+
| 1 | a |
+------+------+
1 row in set (0.00 sec)
Problem: match('a\\=') is expected to find a since = is not in the charset_table, but it doesn't.
As an example a\\- works fine (finds a).
Let's compare the metas:
a=:
show meta
--------------
+----------------+-------+
| Variable_name | Value |
+----------------+-------+
| total | 0 |
| total_found | 0 |
| total_relation | eq |
| time | 0.000 |
| keyword[0] | =a= |
| docs[0] | 0 |
| hits[0] | 0 |
| keyword[1] | a= |
| docs[1] | 0 |
| hits[1] | 0 |
+----------------+-------+
10 rows in set (0.00 sec)
vs
a- :
+----------------+-------+
| Variable_name | Value |
+----------------+-------+
| total | 1 |
| total_found | 1 |
| total_relation | eq |
| time | 0.000 |
| keyword[0] | =a |
| docs[0] | 1 |
| hits[0] | 1 |
| keyword[1] | a |
| docs[1] | 1 |
| hits[1] | 1 |
+----------------+-------+
10 rows in set (0.00 sec)
As we can see, escaping of = and - differ. Seems that when = is escaped it's then get tokenized wrong (like if it was in the charset_table):
mysql> drop table if exists t; create table t(f text) index_exact_words='1' expand_keywords='1' charset_table='non_cjk, ='; insert into t values(1,'a='); select * from t where match('a\\='); select * from t where match('a\\='); show meta;
--------------
drop table if exists t
--------------
Query OK, 0 rows affected (0.13 sec)
--------------
create table t(f text) index_exact_words='1' expand_keywords='1' charset_table='non_cjk, ='
--------------
Query OK, 0 rows affected (0.00 sec)
--------------
insert into t values(1,'a=')
--------------
Query OK, 1 row affected (0.00 sec)
--------------
select * from t where match('a\\=')
--------------
+------+------+
| id | f |
+------+------+
| 1 | a= |
+------+------+
1 row in set (0.00 sec)
--------------
select * from t where match('a\\=')
--------------
+------+------+
| id | f |
+------+------+
| 1 | a= |
+------+------+
1 row in set (0.00 sec)
--------------
show meta
--------------
+----------------+-------+
| Variable_name | Value |
+----------------+-------+
| total | 1 |
| total_found | 1 |
| total_relation | eq |
| time | 0.000 |
| keyword[0] | =a= |
| docs[0] | 1 |
| hits[0] | 1 |
| keyword[1] | a= |
| docs[1] | 1 |
| hits[1] | 1 |
+----------------+-------+
10 rows in set (0.00 sec)
Ah, it seems expand_keywords is critical to reproduce it. With that can reproduce. Sorry for the confusion.
fixed at the 85057e7b
You need to update daemon from dev repo package to get issue fixed.