manticoresearch icon indicating copy to clipboard operation
manticoresearch copied to clipboard

Errors calculating facets for json fields

Open yuliy opened this issue 2 years ago • 4 comments

Environment

# searchd --version  
Manticore 4.0.3 294ecb58e@211001 release  
  
# uname -a  
Linux 3f4cc3b9b1d1 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux  

Bug description

Hello! I met a problem when facets calculation is buggy. To reproduce the problems create index as follows:

create table items(title text, attrs json);  

Then add these data:

insert into items(title, attrs) values('abc', '{"key": [1,2,3]}');  
insert into items(title, attrs) values('abc', '{"key": [2,3,4]}');  

insert into items(title, attrs) values('abc', '{"key": [4,5,2147483647]}');  
insert into items(title, attrs) values('abc', '{"key": [4,5,2147483648]}');  

Problem 1 (sep. issue - https://github.com/manticoresoftware/manticoresearch/issues/807)

If I use alias in facets, internal lists are treated as strings.

mysql> select id from items facet attrs.key as k;  
 ---------------------   
| id                  |  
 ---------------------   
| 4180995377425547270 |  
| 4180995377425547271 |  
| 4180995377425547272 |  
| 4180995377425547269 |  
 ---------------------   
4 rows in set (0.00 sec)  
  
 ------------------ ----------   
| k                | count(*) |  
 ------------------ ----------   
| [4,5,2147483648] |        1 |  
| [4,5,2147483647] |        1 |  
| [4,5,2147483647] |        1 |  
| [2,3,4]          |        2 |  
| [1,2,3]          |        2 |  
| [1,2,3]          |        2 |  
| [1,2,3]          |        1 |  
 ------------------ ----------   

Problem 2.

I can insert invalid json data where numbers exceed 2'147'483'647 (max of int32):

mysql> select * from items;  
 --------------------- -------------------------- -------   
| id                  | attrs                    | title |  
 --------------------- -------------------------- -------   
| 4180995377425547270 | {"key":[2,3,4]}          | abc   |  
| 4180995377425547271 | {"key":[4,5,2147483647]} | abc   |  
| 4180995377425547272 | {"key":[4,5,2147483648]} | abc   |  
| 4180995377425547269 | {"key":[1,2,3]}          | abc   |  
 --------------------- -------------------------- -------   

When I try calculating facets for this data I get unexpected results:

mysql> select id from items facet attrs.key;  
 ---------------------   
| id                  |  
 ---------------------   
| 4180995377425547270 |  
| 4180995377425547271 |  
| 4180995377425547272 |  
| 4180995377425547269 |  
 ---------------------   
4 rows in set (0.00 sec)  
  
 ------------------ ----------   
| attrs.key        | count(*) |  
 ------------------ ----------   
| [4,5,2147483648] |        1 |  
| 2147483647       |        1 |  
| 5                |        1 |  
| 4                |        2 |  
| 3                |        2 |  
| 2                |        2 |  
| 1                |        1 |  
 ------------------ ----------   

yuliy avatar Jun 15 '22 16:06 yuliy

could you try dev package version as facet statement had some fixes at 502 release and after it

tomatolog avatar Jun 15 '22 19:06 tomatolog

could you try dev package version as facet statement had some fixes at 502 release and after it

ok, I'll try it. Would you please tell me where can I find it?

yuliy avatar Jun 15 '22 20:06 yuliy

our manual has a topic on that Debian and Ubuntu Development packages

tomatolog avatar Jun 15 '22 20:06 tomatolog

I've checked nightly build:

root@b2902ebb3006:/var/lib/manticore# searchd --version
Manticore 5.0.3 75899faff@220612 -dev (columnar 1.15.5 a2059e2@220523) (secondary 1.15.5 a2059e2@220523)

Both problems are reproduced on dev build either :(

yuliy avatar Jun 16 '22 14:06 yuliy