qdrant-client
Huge CPU usage of qdrant client when deserializing payload
Title: High CPU Usage with Qdrant Async Client for GRPC and REST Endpoints
Description:
We are experiencing significant CPU usage spikes when enabling the Qdrant async client for both GRPC and REST endpoints. By utilizing a profiling tool (Parca), we have identified that a considerable portion of CPU time is being consumed by payload deserialization.
This issue is particularly problematic for applications like recommendation engines that rely on Qdrant as a vector database, as it results in a substantial increase in CPU requirements for our application.
Steps to Reproduce:
- Enable Qdrant async client for both GRPC and REST endpoints.
- Run the application and perform standard operations.
- Monitor CPU usage and analyze with a profiling tool (e.g., Parca).
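For a quick local check before reaching for a continuous profiler, the steps above can be approximated with the standard library's cProfile. This is only a sketch: `make_payloads` and `deserialize_all` are hypothetical stand-ins with synthetic data, not the qdrant-client code path, and `json.loads` is used purely as a placeholder for the deserialization step.

```python
import cProfile
import io
import json
import pstats


def make_payloads(n=200, fields=50):
    """Build n JSON-encoded payloads with `fields` keys each (synthetic data)."""
    payload = {f"key_{i}": {"value": i, "tags": ["a", "b", "c"]} for i in range(fields)}
    return [json.dumps(payload) for _ in range(n)]


def deserialize_all(raw_payloads):
    """Hypothetical stand-in for the client's payload deserialization step."""
    return [json.loads(raw) for raw in raw_payloads]


def profile_deserialization(raw_payloads, top=10):
    """Profile deserialize_all and return the cumulative-time report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    deserialize_all(raw_payloads)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_deserialization(make_payloads()))
```

Swapping `deserialize_all` for the real client call should show whether deserialization dominates the cumulative time in the same way the Parca profile suggests.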
Expected Behavior:
- Efficient CPU usage when deserializing payloads.
Actual Behavior:
- High CPU usage is observed, primarily due to payload deserialization.
Environment:
- Qdrant version: [Specify version]
- GRPC and REST async client setup
- Profiling tool: Parca
Proposed Solution:
To mitigate the high CPU usage, we propose utilizing orjson for JSON serialization and deserialization. orjson is known for its performance benefits compared to the standard JSON library in Python. By replacing the current deserialization process with orjson, we anticipate a reduction in CPU overhead.
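A rough way to check the expected gain is to time both decoders on the same bytes. The sketch below assumes a synthetic payload shape (`build_raw_payload` is a hypothetical helper) and treats orjson as optional, so it still runs when the package is not installed. It measures raw JSON decoding only, not any model validation the client performs afterwards.

```python
import json
import timeit

try:
    import orjson  # third-party package; optional for this sketch
except ImportError:
    orjson = None


def build_raw_payload(fields=200):
    """Return a synthetic JSON payload as UTF-8 bytes (assumed shape, not real data)."""
    payload = {
        f"key_{i}": {"score": i * 0.5, "tags": ["x", "y"], "text": "lorem ipsum"}
        for i in range(fields)
    }
    return json.dumps(payload).encode("utf-8")


def time_decoders(raw, repeats=200):
    """Time json.loads (and orjson.loads, if available) over `repeats` runs."""
    timings = {"json": timeit.timeit(lambda: json.loads(raw), number=repeats)}
    if orjson is not None:
        timings["orjson"] = timeit.timeit(lambda: orjson.loads(raw), number=repeats)
    return timings


if __name__ == "__main__":
    for name, seconds in time_decoders(build_raw_payload()).items():
        print(f"{name}: {seconds:.4f}s")
```

Results will vary with payload size and structure, so it is worth running this against payloads that resemble the production data.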
Additional Context: This CPU overhead is posing a challenge in deploying Qdrant in resource-constrained environments. Any guidance or fixes to address this issue would be greatly appreciated.
hi @GDegrove
just to be sure, could you also provide the version of pydantic you're using?
We are using pydantic 2, specifically version 2.8.2.
We'll try to look into it, thanks for pointing it out
Pydantic doesn't use the json package from the standard library for handling de/serialization. Instead, it handles that internally.
Unfortunately, it is not possible to use a custom JSON library in pydantic v2 (see, e.g., https://github.com/pydantic/pydantic/discussions/6388). We've looked into the other places where we could replace the built-in json module with orjson; however, the results are not promising at the moment (#902). We'll continue looking into possible performance improvements.