[Bug]: Fail to search on QueryNode: When metricType is L2 and range_filter < radius, precision loss error occurs. The range_filter must be less than the radius.
Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version: master-20240415-70e3d5b-909836f
- Deployment mode(standalone or cluster):standalone
- MQ type(rocksmq, pulsar or kafka): None
- SDK version(e.g. pymilvus v2.0.0rc2):pymilvus
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
- An unexpected error was encountered when using SearchIterator: range_filter (101492340) must be less than radius (101492340).
- Further investigation showed that, although the documentation states that both range_filter and radius are of float type, the validation in the search interface loses precision before the comparison. For example, with the following search_params:
search_params = {
"metric_type": "L2",
"params": {"radius": 101492336.05, "range_filter": 101492336.0},
}
- Calling the search interface with these parameters results in an error:
/usr/bin/python3 /Users/zilliz/work/code/pymilvus/examples/iterator_float16.py
RPC error: [search], <MilvusException: (code=65535, message=fail to search on QueryNode 1: worker(1) query failed: Assert "range_filter < radius" at /go/src/github.com/milvus-io/milvus/internal/core/src/common/RangeSearchHelper.cpp:128
=> range_filter(101492340) must be less than radius(101492340) for L2/HAMMING/JACCARD)>, <Time:{'RPC start': '2024-04-19 14:42:07.215498', 'RPC error': '2024-04-19 14:42:07.842446'}>
Traceback (most recent call last):
File "/Users/zilliz/work/code/pymilvus/examples/iterator_float16.py", line 87, in <module>
fp16_vector_search()
File "/Users/zilliz/work/code/pymilvus/examples/iterator_float16.py", line 68, in fp16_vector_search
res = hello_milvus.search(vectors_to_search, vector_field_name, search_params, limit=1)
File "/Users/zilliz/work/code/pymilvus/pymilvus/orm/collection.py", line 799, in search
resp = conn.search(
File "/Users/zilliz/work/code/pymilvus/pymilvus/decorators.py", line 140, in handler
raise e from e
File "/Users/zilliz/work/code/pymilvus/pymilvus/decorators.py", line 136, in handler
return func(*args, **kwargs)
File "/Users/zilliz/work/code/pymilvus/pymilvus/decorators.py", line 175, in handler
return func(self, *args, **kwargs)
File "/Users/zilliz/work/code/pymilvus/pymilvus/decorators.py", line 115, in handler
raise e from e
File "/Users/zilliz/work/code/pymilvus/pymilvus/decorators.py", line 86, in handler
return func(*args, **kwargs)
File "/Users/zilliz/work/code/pymilvus/pymilvus/client/grpc_handler.py", line 798, in search
return self._execute_search(request, timeout, round_decimal=round_decimal, **kwargs)
File "/Users/zilliz/work/code/pymilvus/pymilvus/client/grpc_handler.py", line 739, in _execute_search
raise e from e
File "/Users/zilliz/work/code/pymilvus/pymilvus/client/grpc_handler.py", line 732, in _execute_search
check_status(response.status)
File "/Users/zilliz/work/code/pymilvus/pymilvus/client/utils.py", line 62, in check_status
raise MilvusException(status.code, status.reason, status.error_code)
- In the error message, the values of both radius and range_filter have been changed to 101492340.
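The collapse can be reproduced on the client without a Milvus server. This is a minimal sketch, assuming the server narrows both parameters to IEEE 754 single precision before comparing them; the helper name `to_float32` is hypothetical and not part of pymilvus.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


radius = 101492336.05
range_filter = 101492336.0

# The two values are distinct as Python doubles...
print(range_filter < radius)                          # True
# ...but identical once narrowed to single precision.
print(to_float32(radius), to_float32(range_filter))   # 101492336.0 101492336.0
print(to_float32(range_filter) < to_float32(radius))  # False
```

Under that assumption, the `range_filter < radius` assertion can no longer hold by the time the values are compared.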
Expected Behavior
Precision loss should not occur, and queries should proceed normally.
Steps To Reproduce
Reproducing steps 2-4 consistently should yield the same results.
Milvus Log
No response
Anything else?
No response
Does this value overflow for fp16?
The value 101492336.05 overflows for float16 but not for float32. Does this mean that the radius and range_filter parameters are of float16 type? I didn't see any specific indication in the documentation.
If I use the searchParams {"radius": 702336.05, "range_filter": 702336.0}, it works fine, even though 702336 overflows for float16.
Sparse vectors also have similar problems.
/assign @liliu-z /unassign
This actually has nothing to do with whether the value overflows float16. If a number needs more significant bits than the float type can store (23 mantissa bits), it can only be represented approximately.
Loss of precision cannot be avoided. In practice, the 23 effective bits already cover most scenarios. I think it is more appropriate to add a note about precision loss to the documentation.
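To make the spacing concrete: a float32 has a 24-bit significand (23 stored bits plus an implicit leading bit), so the gap between adjacent representable values grows with magnitude. The sketch below uses a hypothetical helper, `float32_spacing`, which ignores subnormals and overflow, to compare the gap near the two pairs of values mentioned in this thread.

```python
import math


def float32_spacing(value: float) -> float:
    """Gap between adjacent float32 values around a normal, finite value."""
    # frexp returns (m, exponent) with abs(value) == m * 2**exponent, 0.5 <= m < 1.
    _, exponent = math.frexp(abs(value))
    # A 24-bit significand splits [2**(exponent-1), 2**exponent) into steps of 2**(exponent-24).
    return 2.0 ** (exponent - 24)


print(float32_spacing(702336.0))      # 0.0625
print(float32_spacing(101492336.0))   # 8.0
```

A difference of 0.05 is more than half of the 0.0625 gap near 702336, so 702336.05 and 702336.0 round to different floats. Near 101492336 the gap is 8, so 101492336.05 and 101492336.0 round to the same float.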
Hi @lentitude2tk, this issue is caused by the precision of the float data type. A float retains only about six to seven significant decimal digits, so '101492336.05' and '101492336.0' cannot be distinguished as floats.
I wrote a simple .cc file:
#include <iostream>

int main() {
    float a = 101492336.05;
    float b = 101492336.0;
    std::cout << "a = " << a << std::endl;
    std::cout << "b = " << b << std::endl;
    if (a > b) {
        std::cout << "a > b" << std::endl;
    } else if (a == b) {
        std::cout << "a == b" << std::endl;
    } else {
        std::cout << "a < b" << std::endl;
    }
    return 0;
}
The output is:
a = 1.01492e+08
b = 1.01492e+08
a == b
In C++, 'a' and 'b' compare as equal.
@cydrain I think I understand: this is caused by the limited precision of C++ floating-point numbers. The problem it creates is that searchIterator in pymilvus or Java continuously adjusts the values of range_filter and radius to obtain a wider range of results. The two values are still distinguishable in Python, but once the request is sent to the server, the error is triggered.
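One way a client could detect this situation before sending the request is to compare the bounds after narrowing them to single precision. This is only a sketch: `bounds_distinct_in_float32` is a hypothetical helper, not a pymilvus function, and it assumes the server compares the two values as float32 and requires range_filter < radius, as it does for L2.

```python
import struct


def bounds_distinct_in_float32(radius: float, range_filter: float) -> bool:
    """Return True if range_filter < radius still holds after float32 rounding."""
    def narrow(value: float) -> float:
        # Round-trip through a 4-byte float to emulate single precision.
        return struct.unpack("f", struct.pack("f", value))[0]

    return narrow(range_filter) < narrow(radius)


# Bounds from the report: indistinguishable as float32.
print(bounds_distinct_in_float32(101492336.05, 101492336.0))  # False
# At a smaller magnitude the same 0.05 gap survives the rounding.
print(bounds_distinct_in_float32(702336.05, 702336.0))        # True
```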
Hi @lentitude2tk, I googled "c++ float", "python float" and "java float". In C++ and Java, the float data type follows the IEEE 754 standard for single-precision floating-point numbers; Python's built-in float is double precision, but the value is narrowed to single precision once it reaches the C++ side. That means that even if the search iterator in Python or Java derives a range filter such as "101492336.05" from the distance of the previous iteration, only about 1.01492e+08 is accurate; the remaining digits are meaningless.
Did we generate this dataset ourselves? If the radius and range_filter derived from the previous search iteration are this close, all of the query results have distances approximately equal to this radius. That suggests the dataset is poorly distributed, and a more evenly distributed dataset should be generated to avoid this problem.