Difference between count(*) and number of results from select * from ....
Hi,
I have the following weird situation: if I run this query with count(*), the result of the count is 1372,
but if I run a select * query with the same parameters, the number of returned elements is 1159.
However, if I add sequence_number > 1158, the result is 213 elements,
so 213 + 1159 = 1372 (the sequence_number is ordered from 0 to 1371).
I'm currently running Manticore 3.6.0, and the total number of records is 5485931. A sketch of the queries is below.
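For illustration, the queries have roughly this shape (the actual WHERE clause isn't shown here; zip_name and its value are just placeholders):

select count(*) from rt1 where zip_name = 'some_zip'; -- returns 1372
select * from rt1 where zip_name = 'some_zip' limit 2000; -- returns 1159 rows
select * from rt1 where zip_name = 'some_zip' and sequence_number > 1158; -- returns 213 rows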
here is my manticore.conf:
index rt1
{
type = rt
min_infix_len = 3
.
.
.
html_remove_elements = style, script, embed, video, audio, img, canvas, svg, meta
html_strip = 1
access_plain_attrs = mmap
path = /..../sphinx/rt1/sphinx
rt_mem_limit = 512M
}
searchd
{
listen = 127.0.0.1:9312
listen = 127.0.0.1:9306:mysql41
log = /var/log/manticore/searchd.log
network_timeout = 60
persistent_connections_limit = 150
pid_file = /var/run/manticore/searchd.pid
seamless_rotate = 1
preopen_indexes = 1
unlink_old = 1
workers = threads
binlog_path = /.../sphinx/binlog/
binlog_flush = 2
max_open_files = 50000
}
common
{
plugin_dir = /usr/local/sphinx/lib
}
index rtall
{
type = distributed
local = rt1
}
What is the problem?
I have to wonder if 10B for max_matches is simply way too big, such that it is being ignored (or worse, integer overflowing/wrapping!).
(Setting max_matches that high could result in searchd trying to allocate 80GB of memory, assuming a conservative 8 bytes per document for just holding the 64-bit id; in reality it's probably more, as it has to allocate storage for attributes, at least the ORDER BY attribute anyway.)
Try setting a more realistic value like 2000, or even say 10,000, if you want it higher than the default of 1000.
If the lower max_matches doesn't help, try select id from ... where ... option max_matches=1; show meta; (a sketch is below). In theory, it may also be some id duplication issue.
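For example (a sketch; the table name and filter are placeholders): total_found in show meta reports the full number of matching documents regardless of max_matches, so it can be compared directly with count(*):

select id from rt1 where zip_name = 'some_zip' option max_matches=1;
show meta; -- compare total_found against count(*) for the same WHERE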
I updated the query as you suggested, but with no differences:
when changing the zip_name condition I always obtain 1159 records,
and also with no conditions at all I obtain 1159 records.
I also launched an optimization of the index with max_chunks = 2 (a sketch of what I ran is below) and stopped/started Manticore, but nothing changed.
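Roughly what I ran (a sketch; plain OPTIMIZE is the portable form, while the exact way to limit the number of resulting chunks to max_chunks = 2 depends on the Manticore version):

optimize index rt1;
show index rt1 status; -- checked disk_chunks before/after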
What could the problem be, and why does it always return 1159?
Can you send us the indexes/config/sample queries (as text) to our write-only FTP? See https://mnt.cr/ftp
@sanikolaev data uploaded, let me know if you need more information.
@sanikolaev Any updates?
Hi @ivanghisleni
It indeed looks wrong:
1159 docs in show index status:
mysql> show index rt1 status;
+-----------------------------+---------------------------------------------------------------------------------------------------------+
| Variable_name | Value |
+-----------------------------+---------------------------------------------------------------------------------------------------------+
| index_type | rt |
| indexed_documents | 1159 |
| indexed_bytes | 62803273811 |
...
| killed_documents | 0 |
...
but 5.4M in select count:
mysql> select count(*) from rt1;
+----------+
| count(*) |
+----------+
| 5485931 |
+----------+
1 row in set (0.12 sec)
No duplicates detected by indextool --check:
snikolaev@dev:/home/manticorebugs/ftproot/github-issue-668/github-issue-668$ sudo indextool -c sphinx.conf --check rt1 --check-id-dups
Manticore 4.2.0 15e927b@211223 release (columnar 1.11.4 327b3d4@211223)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2021, Manticore Software LTD (https://manticoresearch.com)
using config file 'sphinx.conf'...
WARNING: key 'workers' is deprecated in sphinx.conf line 90; use 'default value' instead.
checking index 'rt1'...
WARNING: failed to load RAM chunks, checking only 1 disk chunks
checking schema...
checking disk chunk, extension 129, 0(1)...
checking schema...
checking dictionary...
checking data...
checking rows...
checking attribute blocks index...
checking kill-list...
checking dead row map...
checking doc-id lookup...
checking docid douplicates ...
check passed, 109.4 sec elapsed
Looks like the millions of documents have ids and are filterable:
mysql> select count(*) from rt1 where id > 1604609627439935;
+----------+
| count(*) |
+----------+
| 2478042 |
+----------+
1 row in set (0.10 sec)
and the docs really exist:
mysql> select deleted, count(distinct id) from rt1 group by deleted;
+---------+--------------------+
| deleted | count(distinct id) |
+---------+--------------------+
| 1 | 5485931 |
+---------+--------------------+
1 row in set (0.46 sec)
but I can't fetch more than 1159 docs:
mysql> select * from rt1 where deleted = 1 limit 2000 option max_matches=2000;
...
1159 rows in set (0.15 sec)
Searching through full-text only also finds millions of docs:
mysql> select * from rt1 where match('a') limit 0; show meta;
Empty set (0.30 sec)
+---------------+----------+
| Variable_name | Value |
+---------------+----------+
| total | 1 |
| total_found | 3150661 |
The suspicious thing is:
index_token_filter = archiver_sphinx_integration.so:untokenize_dot_at:
Not sure if it has anything to do with the issue, but I'd try to disable it completely, rebuild the index from scratch and see if it helps.
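E.g. in sphinx.conf, just comment the line out before rebuilding:

# index_token_filter = archiver_sphinx_integration.so:untokenize_dot_at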
@sanikolaev if it can be useful, I can share the code of archiver_sphinx_integration.so:untokenize_dot_at.
My intention is to understand why this condition happens and, if possible, avoid it in the future without rebuilding the index.
Let's try. Can you then share the plugin code and examples of the queries you use to populate the index?
I uploaded these files to the FTP space:
- archiver_sphinx_integration.c
- insert_records_sample.php (a sample of how a record is imported into Manticore)
Maybe it can be useful: the issue appeared after 5M records had been deleted (one by one); every time a record is deleted, the script checks the count by zip_name (a sketch of the pattern is below).
I also uploaded a sample of how a record can be deleted: delete_records_sample.php
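For illustration, the delete-then-check pattern looks roughly like this (the id and zip_name values are placeholders; the real logic is in the uploaded .php samples):

delete from rt1 where id = 12345;
select count(*) from rt1 where zip_name = 'some_zip'; -- the script verifies this count after each delete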
@sanikolaev did you find something that could explain the issue?
Hi. Unfortunately not. I didn't have much time this week. Hopefully I'll be able to look into the issue next week.
ok, I'll wait for your feedback.
@ivanghisleni from your insert query example I can't understand what the actual queries are. Can you please provide a few examples in SQL format, so I can use them to generate an index with a few million docs?
@sanikolaev I uploaded an SQL sample of the insert query.
I need sample values too, not only the fields list; at least just one real query.
Uploaded a real sample insert_records_sample.sql
I've created an RT index based on the sample record and couldn't reproduce the issue:
mysql> select count(*) from rt1;
+----------+
| count(*) |
+----------+
| 5000000 |
+----------+
1 row in set (0.07 sec)
mysql> show index rt1 status;
+-----------------------------+---------------------------------------------------------------------------------------------------------+
| Variable_name | Value |
+-----------------------------+---------------------------------------------------------------------------------------------------------+
| index_type | rt |
| indexed_documents | 5000000 |
| indexed_bytes | 3335000000 |
| ram_bytes | 1198000463 |
| disk_bytes | 2747357127 |
| disk_mapped | 1110670364 |
| disk_mapped_cached | 688775168 |
| disk_mapped_doclists | 0 |
| disk_mapped_cached_doclists | 0 |
| disk_mapped_hitlists | 0 |
| disk_mapped_cached_hitlists | 0 |
| killed_documents | 0 |
| killed_rate | 0.00% |
| ram_chunk | 509193751 |
| ram_chunk_segments_count | 31 |
| disk_chunks | 7 |
| mem_limit | 536870912 |
| mem_limit_rate | 95.00% |
| ram_bytes_retired | 0 |
| tid | 0 |
| tid_saved | 0 |
| query_time_1min | {"queries":2, "avg_sec":0.001, "min_sec":0.001, "max_sec":0.002, "pct95_sec":0.002, "pct99_sec":0.002} |
| query_time_5min | {"queries":2, "avg_sec":0.001, "min_sec":0.001, "max_sec":0.002, "pct95_sec":0.002, "pct99_sec":0.002} |
| query_time_15min | {"queries":2, "avg_sec":0.001, "min_sec":0.001, "max_sec":0.002, "pct95_sec":0.002, "pct99_sec":0.002} |
| query_time_total | {"queries":29, "avg_sec":0.002, "min_sec":0.000, "max_sec":0.016, "pct95_sec":0.015, "pct99_sec":0.016} |
| found_rows_1min | {"queries":2, "avg":1, "min":1, "max":1, "pct95":1, "pct99":1} |
| found_rows_5min | {"queries":2, "avg":1, "min":1, "max":1, "pct95":1, "pct99":1} |
| found_rows_15min | {"queries":2, "avg":1, "min":1, "max":1, "pct95":1, "pct99":1} |
| found_rows_total | {"queries":29, "avg":5099, "min":1, "max":147851, "pct95":3697, "pct99":147851} |
+-----------------------------+---------------------------------------------------------------------------------------------------------+
29 rows in set (0.01 sec)
@ivanghisleni would it be possible to try to reproduce the problem on your side?
I mean by recreating the index from scratch.
@sanikolaev, how did you simulate the issue?
I identified the issue after these steps:
- created ~10M records
- deleted ~5M records, one by one
delete from rt1 where id = ....
- the issue appeared
Did you identify any anomalies in the affected index?
By the way, I'll try to reproduce the issue again on my side.
I've tried again:
- created 10M (actually 10M + 1000) records
- deleted 5M records one by one
and all looks good to me:
mysql> show index rt1 status; select count(*) from rt1;
+-----------------------------+-----------------------------------------------------------------------------------------------------------+
| Variable_name | Value |
+-----------------------------+-----------------------------------------------------------------------------------------------------------+
| index_type | rt |
| indexed_documents | 5001000 |
| indexed_bytes | 6670667000 |
| ram_bytes | 1318392 |
| disk_bytes | 6120285122 |
| disk_mapped | 2474206301 |
| disk_mapped_cached | 1253376 |
| disk_mapped_doclists | 0 |
| disk_mapped_cached_doclists | 0 |
| disk_mapped_hitlists | 0 |
| disk_mapped_cached_hitlists | 0 |
| killed_documents | 4663000 |
| killed_rate | 93.24% |
| ram_chunk | 0 |
| ram_chunk_segments_count | 0 |
| disk_chunks | 15 |
| mem_limit | 536870912 |
| mem_limit_rate | 95.00% |
| ram_bytes_retired | 0 |
| tid | 0 |
| tid_saved | 0 |
| query_time_1min | {"queries":3, "avg_sec":0.086, "min_sec":0.039, "max_sec":0.178, "pct95_sec":0.178, "pct99_sec":0.178} |
| query_time_5min | {"queries":3, "avg_sec":0.086, "min_sec":0.039, "max_sec":0.178, "pct95_sec":0.178, "pct99_sec":0.178} |
| query_time_15min | {"queries":3, "avg_sec":0.086, "min_sec":0.039, "max_sec":0.178, "pct95_sec":0.178, "pct99_sec":0.178} |
| query_time_total | {"queries":149, "avg_sec":0.201, "min_sec":0.000, "max_sec":18.418, "pct95_sec":0.105, "pct99_sec":7.635} |
| found_rows_1min | {"queries":3, "avg":1, "min":1, "max":1, "pct95":1, "pct99":1} |
| found_rows_5min | {"queries":3, "avg":1, "min":1, "max":1, "pct95":1, "pct99":1} |
| found_rows_15min | {"queries":3, "avg":1, "min":1, "max":1, "pct95":1, "pct99":1} |
| found_rows_total | {"queries":149, "avg":373323, "min":1, "max":10001000, "pct95":3697, "pct99":10001000} |
+-----------------------------+-----------------------------------------------------------------------------------------------------------+
29 rows in set (0.04 sec)
+----------+
| count(*) |
+----------+
| 5001000 |
+----------+
1 row in set (0.04 sec)
Didn't you find any evidence in the index I gave you?
Not yet. I hoped there was a way to reproduce it by recreating the index.
By the way, I'll try to reproduce the issue again on my side.
Did you manage to do it?
Yes I did, but no luck.
Any news?
No. Since neither you nor we can reproduce the issue, chances are the index on which it can be reproduced is corrupted. We are still interested in understanding what's wrong with it and why indextool --check
doesn't report it as corrupted, but the priority of this issue is low since:
- there are other tasks that many more people are waiting to see completed
- our team's resources are limited
If the issue is mission-critical for you, we offer professional services.