RediSearch can't search text with '-'

I have some data stored in MongoDB. Here is a sample:

` /* 1 */ { "_id" : "94cfcc6a-a5ea-4e34-a368-dc9111356aa4", "uid" : NumberLong(86590), "app" : "znt", "device_id" : "94f435c7-e36f-42de-8cc5-fb44b2fa3eb8", "create_time" : ISODate("2021-11-26T05:41:13.145Z"), "update_time" : ISODate("2022-03-07T07:50:59.572Z") }

/* 2 */ { "_id" : "2a8e1609-7631-420c-84d7-fd44c9e6342e", "uid" : NumberLong(1801978), "app" : "znt", "device_id" : "1bb4a0b7-ae38-4688-ab8b-d3a70c071f03", "create_time" : ISODate("2022-01-14T06:41:12.373Z"), "update_time" : ISODate("2022-02-10T01:55:07.317Z") }

/* 3 */ { "_id" : "fe3b2dfe-bc6a-4da2-afd7-9a539d81a99a", "uid" : NumberLong(1805564), "app" : "znt", "device_id" : "be7a3ecf-1f7a-49b6-af78-686af6199bd4", "create_time" : ISODate("2022-01-17T08:23:39.382Z"), "update_time" : ISODate("2022-02-09T08:53:58.685Z") } `

I need to search the data by uid, app and device_id. None of these fields needs to be tokenized; the values must match exactly.

Here is my index:

ft.create idx_login_token on json prefix 1 login_token: schema $.uid as uid numeric $.app as app text nostem $.device_id as device_id text nostem

At first I tried the TAG type, but it didn't work. According to the docs, a tag field also tokenizes its content.

127.0.0.1:6379> json.set login_token:94cfcc6a-a5ea-4e34-a368-dc9111356aa4 $ '{"uid":2202221832500000077,"app":"znt","device_id":"94f435c7-e36f-42de-8cc5-fb44b2fa3eb8","create_time":"2021-11-26T05:41:13.145Z","update_time":"2022-02-21T02:40:20.241Z"}'
OK
127.0.0.1:6379> json.set login_token:2a8e1609-7631-420c-84d7-fd44c9e6342e $ '{"uid":2202221832500000077,"app":"znt","device_id":"1bb4a0b7-ae38-4688-ab8b-d3a70c071f03","create_time":"2021-11-26T05:41:13.145Z","update_time":"2022-02-21T02:40:20.241Z"}'
OK
127.0.0.1:6379> json.set login_token:fe3b2dfe-bc6a-4da2-afd7-9a539d81a99a $ '{"uid":1805564,"app":"znt","device_id":"be7a3ecf-1f7a-49b6-af78-686af6199bd4","create_time":"2022-01-17T08:23:39.382Z","update_time":"2022-02-21T02:40:20.241Z"}'
OK

I wrote 3 docs to test.

The index seems OK.

First I searched by uid, which is a long number. I tested these queries:

ft.explain idx_login_token '@uid:[2202221832500000077]'
ft.explain idx_login_token '@uid:2202221832500000077'
ft.explain idx_login_token '@uid:[2202221832500000077 2202221832500000077]'

Only the last one works. The output shows that the number lost precision, but the result is correct.
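The precision loss is what one would expect if NUMERIC values are held as IEEE-754 doubles. I believe that is the case, but treat it as an assumption. A small Python sketch of the arithmetic, using the uid from the test data:

```python
# Assumption: the numeric index stores values as 64-bit floating point doubles.
uid = 2202221832500000077

as_double = float(uid)        # the nearest value a double can hold
round_trip = int(as_double)   # convert back to an exact integer

print(round_trip)                   # 2202221832500000000
print(round_trip == uid)            # False: the last digits were rounded away
print(float(uid + 1) == as_double)  # True: neighbouring uids collapse to one value
```

If that assumption holds, a range query on such a uid could also match nearby uids that round to the same double, so very large ids may be safer stored as strings.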

And the exact-match behavior doesn't feel right.

The app query works.

But I can't search by device_id. At first, I wrote the query directly.

That indicates that the query treats - as an exclusion operator. So I tested the following, escaping the '-'.

But the result is also empty. Then I quoted the query.

The server reports a query syntax error. I also quoted the data and escaped the '-' character, but I still can't find anything.

Mar 07 '22 10:03 cjdxhjj

I found that replacing - with a space works. So searching with @device_id:1bb4a0b7 ae38 4688 ab8b d3a70c071f03 returns the right results.

Mar 07 '22 22:03 michaelbukachi

@michaelbukachi that matches on the individual tokens, which is not an exact match. If another string contains all of those substrings, it will match too.
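To make that concern concrete, here is a rough Python model. It is not RediSearch itself: `tokenize` and `matches_all_tokens` are hypothetical helpers that only imitate "split on punctuation, then require every query token to be present", and the `shuffled` value is made up for illustration.

```python
import re

def tokenize(text: str) -> set[str]:
    # Simplified stand-in for the tokenizer: split on anything non-alphanumeric.
    return {t for t in re.split(r"[^0-9A-Za-z]+", text.lower()) if t}

def matches_all_tokens(query: str, stored: str) -> bool:
    # A space-separated query is modelled as "every token must be present".
    return tokenize(query) <= tokenize(stored)

query = "1bb4a0b7 ae38 4688 ab8b d3a70c071f03"
wanted = "1bb4a0b7-ae38-4688-ab8b-d3a70c071f03"
shuffled = "d3a70c071f03-ab8b-4688-ae38-1bb4a0b7"  # hypothetical other value

print(matches_all_tokens(query, wanted))    # True
print(matches_all_tokens(query, shuffled))  # True, although it is a different string
```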

Mar 08 '22 01:03 cjdxhjj

@cjdxhjj since you are dealing with UUIDs, that is highly unlikely to happen. For normal strings though, this might not work.

Mar 08 '22 13:03 michaelbukachi

You need to escape the "-", see https://oss.redis.com/redisearch/Escaping/

Mar 08 '22 15:03 kkmuffme

Thanks very much, I will give it a try. But I wonder whether the platform provides a data type like keyword in ES, where the string isn't tokenized and can be matched exactly.
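The closest thing I know of is a TAG field. As far as I understand the docs, a tag value is split only on the field's separator, but punctuation in the query still has to be backslash-escaped. A minimal sketch under those assumptions; `escape_tag_value` and `device_query` are hypothetical helpers, and the index definition is the one from above with TEXT swapped for TAG:

```python
import re

def escape_tag_value(value: str) -> str:
    # Backslash-escape every character that is not a letter, digit or underscore.
    return re.sub(r"([^0-9A-Za-z_])", r"\\\1", value)

# Hypothetical index definition: same schema as above, but TAG instead of TEXT.
CREATE_INDEX = (
    "FT.CREATE idx_login_token ON JSON PREFIX 1 login_token: SCHEMA "
    "$.uid AS uid NUMERIC $.app AS app TAG $.device_id AS device_id TAG"
)

def device_query(device_id: str) -> str:
    # Tag queries use the @field:{value} form.
    return "@device_id:{" + escape_tag_value(device_id) + "}"

print(device_query("94f435c7-e36f-42de-8cc5-fb44b2fa3eb8"))
# @device_id:{94f435c7\-e36f\-42de\-8cc5\-fb44b2fa3eb8}
```

With this approach the stored JSON would stay unmodified; only the query string is escaped.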

Mar 09 '22 02:03 cjdxhjj

I'm facing the same issue, I got a JSON document with a UUID but I can't search it by UUID. Maybe a UUID type could avoid this behavior.

Apr 14 '22 23:04 goldyfruit

You need to escape the - when searching as I wrote above. Not that hard guys...

Apr 15 '22 06:04 kkmuffme

You need to escape the - when searching as I wrote above. Not that hard guys...

I never said that it was hard, just that having a dedicated field like UUID could avoid the index process to split the text.

PS: The link from above doesn't work anymore.

Apr 15 '22 12:04 goldyfruit

No, you misunderstand how escaping works. You need to escape BOTH when you index AND when you query. Then it will not split the text and search it as if it were 1 string.
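A sketch of what "escape on both sides" could look like for a TEXT field. The `PUNCTUATION` list is my reading of the separator characters in the escaping reference, so treat it as an assumption, and `escape` is a hypothetical helper, not part of any client library:

```python
import json

# Assumption: separator characters as listed in the escaping reference page.
PUNCTUATION = ",.<>{}[]\"':;!@#$%^&*()-+=~"

def escape(value: str) -> str:
    # Put a backslash in front of every separator or whitespace character.
    return "".join(
        "\\" + ch if ch in PUNCTUATION or ch.isspace() else ch for ch in value
    )

device_id = "94f435c7-e36f-42de-8cc5-fb44b2fa3eb8"

# Escape before storing: this JSON text is what would be passed to JSON.SET.
stored = json.dumps({"app": "znt", "device_id": escape(device_id)})

# Escape again before searching: this string is what would be passed to FT.SEARCH.
query = "@device_id:" + escape(device_id)

print(query)  # @device_id:94f435c7\-e36f\-42de\-8cc5\-fb44b2fa3eb8
```

The downside raised elsewhere in this thread still applies: the stored document then contains the backslashes, so readers of the JSON have to strip them again.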

Apr 15 '22 12:04 kkmuffme

No, you misunderstand how escaping works. You need to escape BOTH when you index AND when you query. Then it will not split the text and search it as if it were 1 string.

Ok

Apr 15 '22 13:04 goldyfruit

You need to escape it before storing it and before searching for it.

Apr 16 '22 02:04 cjdxhjj

You need to escape the "-", see https://oss.redis.com/redisearch/Escaping/

Updated link: https://redis.io/docs/stack/search/reference/escaping/

Apr 24 '22 10:04 oshadmi

Would it make sense maybe to add a field type BINARY to the index field types, which omit stemming and tokenization?

Sometimes you are looking for exact string matches, in my example I have a (user supplied) username in my JSON which I would like to index, but I don't want to modify it inside the JSON to escape a (hopefully correct) list of separators, just because this is the only way, I will be able to make searches work.

Jul 31 '22 19:07 domoran

I'm wondering how I can specify a custom analyzer, something like a "KeywordAnalyzer", to avoid tokenizing the text field.

Jul 31 '23 11:07 lduffy69

I had the same issue. What worked for me was removing the "-" both when indexing and when searching.
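Roughly like this, as a minimal sketch: `normalize_id` is a hypothetical helper name, and the same function has to be applied to the value that gets stored and to the value used in the query.

```python
def normalize_id(value: str) -> str:
    # Drop the hyphens so the value becomes a single alphanumeric token.
    return value.replace("-", "")

device_id = "1bb4a0b7-ae38-4688-ab8b-d3a70c071f03"

stored_value = normalize_id(device_id)          # value written to the document
query = "@device_id:" + normalize_id(device_id)  # value used in the search

print(stored_value)  # 1bb4a0b7ae384688ab8bd3a70c071f03
print(query)         # @device_id:1bb4a0b7ae384688ab8bd3a70c071f03
```

For UUIDs this is lossless, since the hyphens can be put back at fixed positions. For free-form strings such as usernames it is not.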

Sep 12 '23 17:09 aibarra11
