milvus-sdk-java
Inconsistent results returned for the same query
Milvus cluster: v2.0.1, milvus-sdk-java: 2.0.4
public static void runQuickSearch(String collectionName) {
    // Look up the index on the vector field so the search uses its metric type.
    DescIndexResponseWrapper.IndexDesc indexDesc =
            describeIndexInfo(collectionName)
                    .getIndexDescByFieldName(PROPERTIES.getProperty("VECTOR_FIELD"));
    int topK = 20;
    int nq = 5;
    // Use the first nq query embeddings as the search vectors.
    final List<List<Float>> vectors = new ArrayList<>();
    List<Long> ids = new ArrayList<>();
    for (int i = 0; i < nq; i++) {
        vectors.add(QUERY_EMBEDDINGS.get(i));
        ids.add((long) i);
    }
    final String SEARCH_PARAM = "{\"nprobe\":64}";
    SearchParam searchParam = SearchParam.newBuilder()
            .withCollectionName(collectionName)
            .withTopK(topK)
            .withMetricType(indexDesc.getMetricType())
            .withVectors(vectors)
            .withVectorFieldName(PROPERTIES.getProperty("VECTOR_FIELD"))
            .withParams(SEARCH_PARAM)
            .build();
    // Time a single search request.
    long begin = System.currentTimeMillis();
    R<SearchResults> response = milvusClient.search(searchParam);
    long end = System.currentTimeMillis();
    long cost = end - begin;
    System.out.println("Search time cost: " + cost + "ms");
    handleResponseStatus(response);
    SearchResultsWrapper wrapper = new SearchResultsWrapper(response.getData().getResults());
    // Print the ID/score pairs of each query and collect the returned IDs.
    List<List<Long>> results = new ArrayList<>();
    for (int i = 0; i < vectors.size(); ++i) {
        System.out.println("Search result of No." + i);
        List<SearchResultsWrapper.IDScore> scores = wrapper.getIDScore(i);
        System.out.println(scores);
        List<Long> result = new ArrayList<>();
        for (SearchResultsWrapper.IDScore score : scores) {
            result.add(score.getLongID());
        }
        if (!result.isEmpty()) {
            results.add(result);
        }
    }
}
The output shows that the first ID returned for query No.0 differs between two runs: 87356234 vs. 504814.
...
Search time cost: 369ms
Search result of No.0
[(ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 14925721 Score: 1.0722969), (ID: 99117874 Score: 1.080706), (ID: 13934280 Score: 1.0825374), (ID: 72027288 Score: 1.0837624), (ID: 97814940 Score: 1.084217), (ID: 13105118 Score: 1.090414), (ID: 97950492 Score: 1.0904853), (ID: 60066306 Score: 1.091538), (ID: 18760679 Score: 1.0933377)]
...
Search result of No.0
[(ID: 504814 Score: 0.9377404), (ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 28539420 Score: 0.98356724), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 9295796 Score: 1.0313499), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 82631435 Score: 1.0677121), (ID: 14925721 Score: 1.0722969), (ID: 95601336 Score: 1.0781072), (ID: 99117874 Score: 1.080706), (ID: 21686232 Score: 1.0807402), (ID: 61457390 Score: 1.0821154)]
...
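To make the difference between the two result lists for query No.0 easier to read, a small helper can report which IDs appear in only one of the runs. The following is a minimal sketch: the class name `ResultDiff` and the method `onlyIn` are hypothetical and not part of milvus-sdk-java, and the ID lists are copied from the output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ResultDiff {
    /** Returns the IDs that appear in a but not in b, keeping the order of a. */
    public static List<Long> onlyIn(List<Long> a, List<Long> b) {
        Set<Long> other = new HashSet<>(b);
        List<Long> out = new ArrayList<>();
        for (Long id : a) {
            if (!other.contains(id)) {
                out.add(id);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Top-20 IDs of query No.0 from the two runs reported in this issue.
        List<Long> run1 = Arrays.asList(
                87356234L, 344333L, 64429592L, 8430679L, 40642738L,
                18530806L, 80468484L, 22664991L, 53179241L, 21192937L,
                88762980L, 14925721L, 99117874L, 13934280L, 72027288L,
                97814940L, 13105118L, 97950492L, 60066306L, 18760679L);
        List<Long> run2 = Arrays.asList(
                504814L, 87356234L, 344333L, 28539420L, 64429592L,
                8430679L, 40642738L, 9295796L, 18530806L, 80468484L,
                22664991L, 53179241L, 21192937L, 88762980L, 82631435L,
                14925721L, 95601336L, 99117874L, 21686232L, 61457390L);
        System.out.println("Only in run 1: " + onlyIn(run1, run2));
        System.out.println("Only in run 2: " + onlyIn(run2, run1));
    }
}
```

On the two lists above, seven IDs are unique to each run. The thirteen shared IDs carry identical scores in both runs, and every ID unique to the first run scores worse than the last entry of the second run, so the second run looks like the first one plus seven closer candidates.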
Do you mean the first search returns [(ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), ...] and the second search returns [(ID: 504814 Score: 0.9377404), (ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), ...]?
Were there any operations (a delete, for example) between the two search requests?
Yes, as you can see, they are different. There were no other operations between the two searches. The dataset is the 100m dataset used in milvus_bootcamp.
Could you show me the "SEARCH_PARAM"? Is the index IVF_FLAT? What is the value of "nlist"? What is the value of "nprobe" in the "SEARCH_PARAM"?
Could you provide reproducible steps so that we can debug this in the source code?
SEARCH_PARAM sets nprobe to 64. The index is IVF_SQ8. The dataset is 100m, and I followed the steps in https://github.com/milvus-io/bootcamp/blob/master/benchmark_test/lab2_sift1b_100m.md to create the index.
I have created a Java program that reproduces this issue and sent it to the Milvus community. I hope you receive it. Please run the program as follows:
java -jar target/milvus-benchmark-1.0-SNAPSHOT.jar -a QUICKSEARCH -c ann_100m_sq8
Search time cost: 477ms
Search result of No.0
[(ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 14925721 Score: 1.0722969), (ID: 99117874 Score: 1.080706), (ID: 13934280 Score: 1.0825374), (ID: 72027288 Score: 1.0837624), (ID: 97814940 Score: 1.084217), (ID: 13105118 Score: 1.090414), (ID: 97950492 Score: 1.0904853), (ID: 60066306 Score: 1.091538), (ID: 18760679 Score: 1.0933377)]
...
Search time cost: 334ms
Search result of No.0
[(ID: 504814 Score: 0.9377404), (ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 28539420 Score: 0.98356724), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 9295796 Score: 1.0313499), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 82631435 Score: 1.0677121), (ID: 14925721 Score: 1.0722969), (ID: 95601336 Score: 1.0781072), (ID: 99117874 Score: 1.080706), (ID: 21686232 Score: 1.0807402), (ID: 61457390 Score: 1.0821154)]
I have uploaded my Java program to: https://github.com/ministat/milvus-unstable-results. Please check whether you can reproduce this issue.
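To check how often the difference shows up, the same search can be issued several times and each result compared with the first one. The following is a minimal sketch, assuming the search is wrapped in a `Supplier<List<Long>>` that returns the IDs for one query. `StabilityCheck`, `firstDivergentRun`, and the fake supplier in `main` are hypothetical stand-ins; they are not part of milvus-sdk-java or of the uploaded program.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

public class StabilityCheck {
    /**
     * Calls the search the given number of times and returns the index of the
     * first run whose ID list differs from run 0, or -1 if all runs agree.
     */
    public static int firstDivergentRun(Supplier<List<Long>> search, int runs) {
        List<Long> baseline = search.get();
        for (int i = 1; i < runs; i++) {
            if (!baseline.equals(search.get())) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for a real search call: the first two calls return one
        // list and later calls return another, to simulate unstable results.
        int[] calls = {0};
        Supplier<List<Long>> fakeSearch = () -> calls[0]++ < 2
                ? Arrays.asList(87356234L, 344333L)
                : Arrays.asList(504814L, 87356234L);
        System.out.println("First divergent run: " + firstDivergentRun(fakeSearch, 5));
    }
}
```

In a real reproduction the supplier would call the search and extract the IDs of one query, in the same way the loop in runQuickSearch collects them with getLongID().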