bug: Slow vector parameter passing
Bug Report
YDB Python SDK version: 3.21.1
Python version: 3.8.10
OS: Linux-5.4.210-39.1.pagevecsize-x86_64-with-glibc2.29
Behavior:
- When I pass a vector to the query as a list, I get 127 RPS.
- When I pass a vector to the query as a string, I get 617 RPS.
The first way is the default in YDB vector search and is the one used in langchain-ydb, but it is slower. The second way is undocumented, but it is much faster.
For comparison, the C++ SDK reaches 810 and 860 RPS in the same two modes.
Please fix passing a vector as a list in the Python SDK; 127 RPS is too slow.
See the example in the attached Python file: vector-parameter.py.
You can switch between the two modes with these lines:
MODE = "list"
# MODE = "string"
@vgvoleg , please have a look
@asmyasnikov , please have a look
I took the attached Python file, vector-parameter.py, and made two changes:
1. Line 13:
# MODE = "list"
MODE = "string"
2. Added a bytes copy on line 58 to remove zero-copy:
parameters = {"$EmbeddingString": ydb.TypedValue(bytes(random.choice(embeddings_binary)), ydb.PrimitiveType.String)}
Result: RPS is still high, so zero-copy is not the reason the string mode is fast. Conclusion: we should consider passing the vector as a serialized string.
Documentation changes describing the serialization format: https://github.com/ydb-platform/ydb/pull/22048
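For anyone who wants to try the string mode before the documentation lands, here is a minimal sketch of client-side serialization. It assumes the float vector format is the little-endian float32 values followed by a single type-tag byte `0x01`; that layout is my assumption and should be checked against the linked documentation change. The helper names `vector_to_bytes` and `bytes_to_vector` are hypothetical and are not part of the SDK.

```python
import struct
from typing import List

# Assumption: a float32 vector is terminated by a one-byte type tag 0x01.
FLOAT_VECTOR_TAG = b"\x01"


def vector_to_bytes(vector: List[float]) -> bytes:
    """Pack the values as little-endian float32 and append the assumed tag."""
    return struct.pack("<%df" % len(vector), *vector) + FLOAT_VECTOR_TAG


def bytes_to_vector(data: bytes) -> List[float]:
    """Inverse of vector_to_bytes, useful for checking a round trip."""
    if not data.endswith(FLOAT_VECTOR_TAG):
        raise ValueError("unexpected vector type tag")
    payload = data[:-1]
    if len(payload) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    return list(struct.unpack("<%df" % (len(payload) // 4), payload))


embedding = [1.0, 0.5, -2.0]
serialized = vector_to_bytes(embedding)
```

The resulting `serialized` value could then be passed the same way as in the comment above, that is, as `ydb.TypedValue(serialized, ydb.PrimitiveType.String)`. The packing is done in a single `struct.pack` call rather than element by element.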