
Format error in JSON body: data did not match any variant of untagged enum VectorStruct

Open achillesliu opened this issue 1 year ago • 8 comments

The Python library qdrant-client raises an error when upserting a sparse vector.

Current Behavior

The traceback is as follows:

UnexpectedResponse                        Traceback (most recent call last)
Cell In[17], line 1
----> 1 qclient.upsert(
      2     collection_name=collection_name,
      3     points=[
      4         models.PointStruct(
      5             id=1,
      6             payload={'metadata': 'metadata'},
      7             vector={
      8                 # 'default_dense': [1.] * 1024,
      9                 'text': models.SparseVector(
     10                     indices=[],
     11                     values=[],
     12                 ),
     13             },
     14         )
     15         # for text, sparse_dict, dense_vec in zip(
     16         #     hypo_answers_original[:100],
     17         #     # sparse_emb_answers['lexical_weights'],
     18         #     dense_emb_answers['dense_vecs'][:100],
     19         #     dense_emb_answers['dense_vecs'][:100],
     20         # )
     21     ]
     22 )

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/qdrant_client.py:1349, in QdrantClient.upsert(self, collection_name, points, wait, ordering, shard_key_selector, **kwargs)
   1321 """
   1322 Update or insert a new point into the collection.
   1323
   (...)
   1345     Operation Result(UpdateResult)
   1346 """
   1347 assert len(kwargs) == 0, f"Unknown arguments: {list(kwargs.keys())}"
-> 1349 return self._client.upsert(
   1350     collection_name=collection_name,
   1351     points=points,
   1352     wait=wait,
   1353     ordering=ordering,
   1354     shard_key_selector=shard_key_selector,
   1355     **kwargs,
   1356 )

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/qdrant_remote.py:1756, in QdrantRemote.upsert(self, collection_name, points, wait, ordering, shard_key_selector, **kwargs)
   1753 if isinstance(points, models.Batch):
   1754     points = models.PointsBatch(batch=points, shard_key=shard_key_selector)
-> 1756 http_result = self.openapi_client.points_api.upsert_points(
   1757     collection_name=collection_name,
   1758     wait=wait,
   1759     point_insert_operations=points,
   1760     ordering=ordering,
   1761 ).result
   1762 assert http_result is not None, "Upsert returned None result"
   1763 return http_result

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/http/api/points_api.py:1667, in SyncPointsApi.upsert_points(self, collection_name, wait, ordering, point_insert_operations)
   1657 def upsert_points(
   1658     self,
   1659     collection_name: str,
   (...)
   1662     point_insert_operations: m.PointInsertOperations = None,
   1663 ) -> m.InlineResponse2006:
   1664     """
   1665     Perform insert + updates on points. If point with given ID already exists - it will be overwritten.
   1666     """
-> 1667     return self._build_for_upsert_points(
   1668         collection_name=collection_name,
   1669         wait=wait,
   1670         ordering=ordering,
   1671         point_insert_operations=point_insert_operations,
   1672     )

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/http/api/points_api.py:852, in _PointsApi._build_for_upsert_points(self, collection_name, wait, ordering, point_insert_operations)
    850 if "Content-Type" not in headers:
    851     headers["Content-Type"] = "application/json"
--> 852 return self.api_client.request(
    853     type_=m.InlineResponse2006,
    854     method="PUT",
    855     url="/collections/{collection_name}/points",
    856     headers=headers if headers else None,
    857     path_params=path_params,
    858     params=query_params,
    859     content=body,
    860 )

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/http/api_client.py:79, in ApiClient.request(self, type_, method, url, path_params, **kwargs)
     77     kwargs["timeout"] = int(kwargs["params"]["timeout"])
     78 request = self.client.build_request(method, url, **kwargs)
---> 79 return self.send(request, type_)

File ~/miniconda3/envs/general/lib/python3.8/site-packages/qdrant_client/http/api_client.py:102, in ApiClient.send(self, request, type_)
    100 except ValidationError as e:
    101     raise ResponseHandlingException(e)
--> 102 raise UnexpectedResponse.for_response(response)

UnexpectedResponse: Unexpected Response: 400 (Bad Request) Raw response content: b'{"status":{"error":"Format error in JSON body: data did not match any variant of untagged enum VectorStruct at line 1 column 63"},"time":0.0}'
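
The 400 response reports a position in the request body (line 1 column 63). One way to make use of that, assuming you can capture the exact JSON the client sent (for example through a logging proxy), is to slice the body around the reported column. This is only a sketch: the helper name `context_at_column` is hypothetical, the column is assumed to be 1-based, and the sample body is a hand-built stand-in rather than the bytes qdrant-client actually sent, so its column 63 need not match the real request.

```python
import json


def context_at_column(body: str, column: int, width: int = 20) -> str:
    """Return the text around a column of a single-line JSON body.

    The column is treated as 1-based (an assumption about the server's
    error format). A ">>>" marker is placed just before that character.
    """
    index = column - 1
    start = max(0, index - width)
    end = min(len(body), index + width)
    return body[start:index] + ">>>" + body[index:end]


# Hypothetical stand-in for the request body (not captured traffic).
sample_body = json.dumps(
    {"points": [{"id": 1, "vector": {"text": {"indices": [], "values": []}}}]},
    separators=(",", ":"),
)

print(context_at_column(sample_body, column=63))
```

Running the same helper on the real captured body would show which field the server was reading when it gave up matching `VectorStruct`.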

Steps to Reproduce

  1. Set up a conda env with Python 3.8 or 3.10 (the error occurs with both) and qdrant-client 1.7, 1.10, or 1.11 (all three fail).
  2. Set up a qdrant Docker container with sudo docker run -d -p 6333:6333 -v qdrant_data:/qdrant/storage:z qdrant/qdrant
  3. Create a qdrant collection with the following code:
from qdrant_client import QdrantClient, models

qclient = QdrantClient(url="http://localhost:6333")
collection_name = 'test_multiple_vectors'
qclient.recreate_collection(
    collection_name=collection_name,
    vectors_config={
        'default_dense': models.VectorParams(size=1024, distance=models.Distance.COSINE)
    },
    sparse_vectors_config={
        "text": models.SparseVectorParams(index=models.SparseIndexParams(on_disk=False))
    },
)
  4. Upsert with the following code:
qclient.upsert(
    collection_name=collection_name,
    points=[
        models.PointStruct(
            id=1,
            vector={
                'text': models.SparseVector(
                    indices=[1, 3],
                    values=[0.1, 0.3],
                ),
            },
        )
    ],
)
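
The traceback and the upsert step use different sparse vectors (empty lists in the traceback, two entries in the step above), so it may help to check the data before sending it. The following is a minimal sketch of such a check, based on the general expectation that a sparse vector is a pair of equal-length lists with unique, non-negative integer indices. The function name `sparse_vector_problems` is hypothetical, and the thread does not establish that any of these conditions caused the 400 in this issue; the empty-vector check in particular is only flagged for inspection.

```python
from typing import List, Sequence


def sparse_vector_problems(indices: Sequence, values: Sequence) -> List[str]:
    """Return human-readable problems found in a sparse vector (empty if none)."""
    problems = []
    if len(indices) != len(values):
        problems.append("indices and values differ in length")
    if len(set(indices)) != len(indices):
        problems.append("indices contain duplicates")
    # bool is a subclass of int, so it is excluded explicitly.
    if any(isinstance(i, bool) or not isinstance(i, int) or i < 0 for i in indices):
        problems.append("indices must be non-negative plain integers")
    if any(isinstance(v, bool) or not isinstance(v, (int, float)) for v in values):
        problems.append("values must be plain numbers")
    if len(indices) == 0:
        problems.append("vector is empty")
    return problems


print(sparse_vector_problems([1, 3], [0.1, 0.3]))
print(sparse_vector_problems([], []))
```

Values that come out of an embedding model as library-specific numeric types or string keys would be flagged here too, which is a prompt to convert them to plain `int` and `float` before building `models.SparseVector`.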

Possible Solution

I tried creating a new conda env, but the error persisted.

achillesliu avatar Aug 19 '24 11:08 achillesliu

Update: when the qdrant client is created in local in-memory mode, client = QdrantClient(':memory:'), everything works.

achillesliu avatar Aug 19 '24 12:08 achillesliu

hi @achillesliu, could you please tell us which qdrant version you are using?

unfortunately, we could not reproduce the issue

joein avatar Aug 21 '24 17:08 joein

I am also facing the same issue with sparse embedding.

using qdrant-client version 1.10.1

using sparse model: prithivida/Splade_PP_en_v1

ModelTorquie avatar Aug 29 '24 12:08 ModelTorquie

@DhavalArGEP hi, could you please try qdrant-client 1.11.1 with qdrant 1.10 or later (preferably 1.11.1)? Unfortunately, we could not reproduce it ourselves.

joein avatar Aug 29 '24 13:08 joein

@joein I am facing the same issue with qdrant-client 1.11.1 as well.

ModelTorquie avatar Aug 30 '24 09:08 ModelTorquie

@DhavalArGEP is it reproducible with the code provided in the issue, or could you maybe provide your own?

Which version of qdrant (not qdrant-client) are you using?

joein avatar Aug 30 '24 11:08 joein
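
The server version asked about above is separate from the qdrant-client version. A minimal sketch for reading it, assuming the server's root HTTP endpoint returns a JSON object with a `version` field and that the server listens on `http://localhost:6333` as in the reproduction steps. The function names are hypothetical, and the network call is left commented out because it needs a running server.

```python
import json
import urllib.request


def parse_server_version(body: str) -> str:
    """Extract the version string from the root endpoint's JSON body."""
    return json.loads(body)["version"]


def fetch_server_version(base_url: str = "http://localhost:6333") -> str:
    """Fetch the root endpoint and return the reported server version."""
    with urllib.request.urlopen(base_url, timeout=5) as response:
        return parse_server_version(response.read().decode("utf-8"))


# Requires a running qdrant server:
# print(fetch_server_version())
```

With Docker, the image tag matters as well: `qdrant/qdrant` without a tag resolves to whatever `latest` was when the image was pulled, so the running version can be older than expected.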

@joein I am not able to provide the code because it is used in my organization. However, the same code works on Windows but not on Linux (I am using Docker).

ModelTorquie avatar Sep 02 '24 07:09 ModelTorquie

@DhavalArGEP can you create a minimal reproducible example? Not exactly the same code you use, but the one which is enough for pinpointing the issue

Could you also tell us the version of qdrant you are using?

joein avatar Sep 02 '24 10:09 joein

This issue was resolved by upgrading the qdrant-client and sentence-transformers packages to their latest versions. Thank you.

ModelTorquie avatar Oct 11 '24 11:10 ModelTorquie
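
Since the fix reported above was a package upgrade, recording the installed versions before and after can make a similar report easier to reproduce. A minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the function name `installed_versions` is hypothetical, and the package names are the distribution names mentioned in this thread.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print(installed_versions(["qdrant-client", "sentence-transformers"]))
```

Running this inside the same environment (or container) that performs the upsert avoids comparing versions from a different interpreter.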