
Inconsistent results returned for the same query

Open ministat opened this issue 2 years ago • 6 comments

Milvus cluster: v2.0.1 milvus-sdk-java: 2.0.4


    public static void runQuickSearch(String collectionName) {
        DescIndexResponseWrapper.IndexDesc indexDesc =
                describeIndexInfo(collectionName)
                        .getIndexDescByFieldName(PROPERTIES.getProperty("VECTOR_FIELD"));
        int topK = 20;
        int nq = 5;
        final List<List<Float>> vectors = new ArrayList<>();
        List<Long> ids = new ArrayList<>();
        for (int i = 0; i < nq; i++) {
            vectors.add(QUERY_EMBEDDINGS.get(i));
            ids.add((long)i);
        }

        final String SEARCH_PARAM = "{\"nprobe\":64}";
        SearchParam searchParam = SearchParam.newBuilder()
                .withCollectionName(collectionName)
                .withTopK(topK)
                .withMetricType(indexDesc.getMetricType())
                .withVectors(vectors)
                .withVectorFieldName(PROPERTIES.getProperty("VECTOR_FIELD"))
                .withParams(SEARCH_PARAM)
                .build();
        long begin = System.currentTimeMillis();
        R<SearchResults> response = milvusClient.search(searchParam);
        long end = System.currentTimeMillis();
        long cost = end - begin;
        System.out.println("Search time cost: " + cost + "ms");
        handleResponseStatus(response);
        SearchResultsWrapper wrapper = new SearchResultsWrapper(response.getData().getResults());
        List<List<Long>> results = new ArrayList<>();
        for (int i = 0; i < vectors.size(); ++i) {
            System.out.println("Search result of No." + i);
            List<SearchResultsWrapper.IDScore> scores = wrapper.getIDScore(i);
            System.out.println(scores);
            List<Long> result = new ArrayList<>();
            for (SearchResultsWrapper.IDScore score : scores) {
                result.add(score.getLongID());
            }
            if (!result.isEmpty()) {
                results.add(result);
            }
        }
    }

Running the same search twice returns different results. The first ID of query No.0 differs between the two runs: 87356234 vs. 504814.

    ...
    Search time cost: 369ms
    Search result of No.0
    [(ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 14925721 Score: 1.0722969), (ID: 99117874 Score: 1.080706), (ID: 13934280 Score: 1.0825374), (ID: 72027288 Score: 1.0837624), (ID: 97814940 Score: 1.084217), (ID: 13105118 Score: 1.090414), (ID: 97950492 Score: 1.0904853), (ID: 60066306 Score: 1.091538), (ID: 18760679 Score: 1.0933377)]
    ...

    Search result of No.0
    [(ID: 504814 Score: 0.9377404), (ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 28539420 Score: 0.98356724), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 9295796 Score: 1.0313499), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 82631435 Score: 1.0677121), (ID: 14925721 Score: 1.0722969), (ID: 95601336 Score: 1.0781072), (ID: 99117874 Score: 1.080706), (ID: 21686232 Score: 1.0807402), (ID: 61457390 Score: 1.0821154)]
    ...

ministat, Mar 14 '22 06:03
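To make the difference between the two runs easier to see, here is a minimal sketch that diffs the two ID lists posted above for query No.0. It is not part of the reporter's program: the class name `ResultDiff` and the helper `onlyIn` are hypothetical, and the IDs are copied from the posted output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of milvus-sdk-java or the reporter's program.
public class ResultDiff {
    // IDs of query No.0 from the first posted run.
    static final List<Long> RUN_A = Arrays.asList(
            87356234L, 344333L, 64429592L, 8430679L, 40642738L,
            18530806L, 80468484L, 22664991L, 53179241L, 21192937L,
            88762980L, 14925721L, 99117874L, 13934280L, 72027288L,
            97814940L, 13105118L, 97950492L, 60066306L, 18760679L);

    // IDs of query No.0 from the second posted run.
    static final List<Long> RUN_B = Arrays.asList(
            504814L, 87356234L, 344333L, 28539420L, 64429592L,
            8430679L, 40642738L, 9295796L, 18530806L, 80468484L,
            22664991L, 53179241L, 21192937L, 88762980L, 82631435L,
            14925721L, 95601336L, 99117874L, 21686232L, 61457390L);

    // Returns the IDs present in `left` but absent from `right`, in `left` order.
    static List<Long> onlyIn(List<Long> left, List<Long> right) {
        Set<Long> rightIds = new HashSet<>(right);
        List<Long> out = new ArrayList<>();
        for (Long id : left) {
            if (!rightIds.contains(id)) {
                out.add(id);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("Only in first run:  " + onlyIn(RUN_A, RUN_B));
        System.out.println("Only in second run: " + onlyIn(RUN_B, RUN_A));
    }
}
```

For the posted lists, each run contains seven IDs that the other lacks. Every ID unique to the first run has a larger score than the last entry of the second run (1.0821154), so the first run reads like the second run with seven closer hits missing and the tail filled with farther ones.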

Do you mean that the first search returns [(ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), ...... ] and the second search returns [(ID: 504814 Score: 0.9377404), (ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), ...... ]?

Were there any operations (a delete, for example) between the two search requests?

yhmo, Mar 14 '22 07:03

Yes, as you can see, they are different. There were no other operations between the two searches. The data set is the 100m set used in milvus_bootcamp.

ministat, Mar 14 '22 09:03

Could you show me the "SEARCH_PARAM"? Is the index IVF_FLAT? What is the value of "nlist"? What is the value of "nprobe" in the "SEARCH_PARAM"?

yhmo, Mar 17 '22 02:03

Is it possible to provide reproducible steps so that we can debug into the source code?

yhmo, Mar 17 '22 02:03
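One way to turn "the same query returns different results" into a repeatable check is to run the search several times and report the first run whose ID list differs from the first one. The sketch below is only an illustration under assumptions: `RepeatSearchCheck` and `firstDivergentRun` are hypothetical names, and the `Supplier` stands in for whatever code performs the search and extracts the IDs (for example, the loop over `wrapper.getIDScore(i)` in the code above).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of milvus-sdk-java.
public class RepeatSearchCheck {
    // Calls `search` up to `runs` times and returns the zero-based index of the
    // first run whose ID list differs from run 0, or -1 if all runs agree.
    static int firstDivergentRun(Supplier<List<Long>> search, int runs) {
        if (runs <= 0) {
            return -1;
        }
        List<Long> baseline = search.get();
        for (int run = 1; run < runs; run++) {
            if (!baseline.equals(search.get())) {
                return run;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for a real search: the first two calls return one list,
        // later calls return another. Replace with a real search call.
        int[] calls = {0};
        Supplier<List<Long>> fakeSearch = () -> calls[0]++ < 2
                ? Arrays.asList(87356234L, 344333L)
                : Arrays.asList(504814L, 87356234L);
        System.out.println("First divergent run: " + firstDivergentRun(fakeSearch, 5));
    }
}
```

Comparing whole ID lists with `equals` is deliberately strict: it flags a change in order as well as a change in membership.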

SEARCH_PARAM is {"nprobe":64}. The index is IVF_SQ8. The dataset is 100m, and I followed the steps in https://github.com/milvus-io/bootcamp/blob/master/benchmark_test/lab2_sift1b_100m.md to create the index.

I have created a Java program that reproduces this issue and sent it to the Milvus community. I hope you receive it. Please run it as follows:

    java -jar target/milvus-benchmark-1.0-SNAPSHOT.jar -a QUICKSEARCH -c ann_100m_sq8

    Search time cost: 477ms
    Search result of No.0
    [(ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 14925721 Score: 1.0722969), (ID: 99117874 Score: 1.080706), (ID: 13934280 Score: 1.0825374), (ID: 72027288 Score: 1.0837624), (ID: 97814940 Score: 1.084217), (ID: 13105118 Score: 1.090414), (ID: 97950492 Score: 1.0904853), (ID: 60066306 Score: 1.091538), (ID: 18760679 Score: 1.0933377)]
    ……
    Search time cost: 334ms
    Search result of No.0
    [(ID: 504814 Score: 0.9377404), (ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 28539420 Score: 0.98356724), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 9295796 Score: 1.0313499), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 82631435 Score: 1.0677121), (ID: 14925721 Score: 1.0722969), (ID: 95601336 Score: 1.0781072), (ID: 99117874 Score: 1.080706), (ID: 21686232 Score: 1.0807402), (ID: 61457390 Score: 1.0821154)]

ministat, Mar 17 '22 11:03

I have uploaded my Java program to: https://github.com/ministat/milvus-unstable-results. Please check whether you can reproduce this issue.

ministat, Mar 18 '22 01:03