Huge CPU usage of qdrant client when deserializing payload

Open GDegrove opened this issue 1 year ago • 5 comments

Title: High CPU Usage with Qdrant Async Client for gRPC and REST Endpoints

Description:

We are experiencing significant CPU usage spikes when enabling the Qdrant async client for both gRPC and REST endpoints. Using a profiling tool (Parca), we found that a considerable portion of CPU time is spent on payload deserialization.

This issue is particularly problematic for applications like recommendation engines that rely on Qdrant as a vector database, as it results in a substantial increase in CPU requirements for our application.

Steps to Reproduce:

  1. Enable the Qdrant async client for both gRPC and REST endpoints.
  2. Run the application and perform standard operations.
  3. Monitor CPU usage and analyze with a profiling tool (e.g., Parca).
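
For readers without Parca, the profiling step can be approximated locally. The following is a minimal sketch that uses Python's built-in cProfile instead of Parca (a deliberate substitution) and does not call the Qdrant client at all: `raw_response` and `handle_response` are hypothetical stand-ins for a search response and the code that decodes it. It only shows how to confirm that deserialization dominates the CPU time of a request handler.

```python
import cProfile
import io
import json
import pstats

# Hypothetical stand-in for a raw search response: 500 points with payloads.
raw_response = json.dumps(
    [
        {"id": i, "score": 0.5, "payload": {"title": f"item-{i}", "tags": ["a", "b"]}}
        for i in range(500)
    ]
)


def handle_response(raw: str) -> float:
    """Decode the response and do a trivial amount of work on it."""
    points = json.loads(raw)  # the deserialization step under suspicion
    return sum(point["score"] for point in points)


profiler = cProfile.Profile()
profiler.enable()
for _ in range(200):
    total = handle_response(raw_response)
profiler.disable()

# Collect the five most expensive entries by cumulative time into a string.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

If the `json` decoder functions account for most of the cumulative time under `handle_response`, the local picture matches what the profiler reported in production.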

Expected Behavior:

  • Efficient CPU usage when deserializing payloads.

Actual Behavior:

  • High CPU usage is observed, primarily due to payload deserialization.

Environment:

  • Qdrant version: [Specify version]
  • gRPC and REST async client setup
  • Profiling tool: Parca

Proposed Solution:

To mitigate the high CPU usage, we propose utilizing orjson for JSON serialization and deserialization. orjson is known for its performance benefits compared to the standard JSON library in Python. By replacing the current deserialization process with orjson, we anticipate a reduction in CPU overhead.
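
A quick way to check whether the proposal would help for a given payload shape is a micro-benchmark. The sketch below is illustrative only: the payload is made up, the numbers depend on the machine and data, and it measures plain JSON decoding rather than anything inside qdrant-client. orjson is a third-party package, so the script falls back to timing only the standard library when it is not installed.

```python
import json
import timeit

try:
    import orjson  # third-party, optional for this sketch
except ImportError:
    orjson = None

# Hypothetical payload resembling a batch of scored points with metadata.
payload = [
    {"id": i, "score": 0.5, "payload": {"title": f"item-{i}", "tags": ["a", "b", "c"]}}
    for i in range(1000)
]
raw = json.dumps(payload).encode("utf-8")


def bench(loads, number: int = 50) -> float:
    """Return the total seconds spent decoding `raw` `number` times."""
    return timeit.timeit(lambda: loads(raw), number=number)


stdlib_seconds = bench(json.loads)
print(f"json.loads:   {stdlib_seconds:.4f}s")

if orjson is not None:
    # Both decoders should produce the same Python objects.
    assert orjson.loads(raw) == json.loads(raw)
    orjson_seconds = bench(orjson.loads)
    print(f"orjson.loads: {orjson_seconds:.4f}s")
else:
    print("orjson is not installed; skipping the comparison")
```

A speedup here does not guarantee a speedup in the client, since it only matters in code paths that actually go through the standard `json` module.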

Additional Context: This CPU overhead is posing a challenge in deploying Qdrant in resource-constrained environments. Any guidance or fixes to address this issue would be greatly appreciated.

GDegrove avatar Aug 02 '24 10:08 GDegrove

hi @GDegrove

just to be sure, could you also provide the version of pydantic you're using?

joein avatar Aug 02 '24 14:08 joein

> hi @GDegrove
>
> just to be sure, could you also provide the version of pydantic you're using?

We are using pydantic 2; to be exact, version 2.8.2.

GDegrove avatar Aug 02 '24 15:08 GDegrove

We'll try to look into it, thanks for pointing it out

joein avatar Aug 07 '24 14:08 joein

Pydantic doesn't use the json package from the standard library for handling de/serialization. Instead, it handles that internally.

arthur-st avatar Sep 19 '24 06:09 arthur-st

Unfortunately, it is not possible to use a custom JSON tool in pydantic v2 (see e.g. https://github.com/pydantic/pydantic/discussions/6388). We've looked into the other places where we could replace the builtin json module with orjson; however, the results are not very promising at the moment (#902). We'll continue looking into possible performance improvements.

joein avatar Feb 20 '25 13:02 joein