
feat(integrations): Add integration for qdrant

Open mxrcooo opened this issue 1 year ago • 0 comments

Adds an integration for Qdrant, supporting both REST and gRPC mode.

Qdrant is a vector-search database. Mentioned here: https://github.com/getsentry/sentry-python/discussions/3007#discussioncomment-9251889

Opening this as a draft PR, as discussed in the community Discord, since I still have a few open questions. There is also a lot of work (mainly tests) left to do before this can be finalized.

  1. The Qdrant service offers two APIs: REST and gRPC. In terms of Qdrant SDK usage, they differ only in the prefer_grpc parameter passed when creating the (Async)QdrantClient. The data sent to the server is almost the same in both modes, but it is structured differently. Question: I currently handle this by using a different op argument for each mode (db.qdrant.rest and db.qdrant.grpc respectively) when creating the span. Is that fine, or should they be merged? If they should be merged, what about the description? The descriptions differ between the two modes, and imo merging them would not be clean.

  2. The HttpxIntegration captures the REST request caused by a Qdrant SDK call, leading to a near-duplicate span that carries less information than the one from our integration. QdrantIntegration offers an option to mute this span, which works by accessing the _span_recorder of the current transaction and removing the subsequent span recorded under our current span. This feels very hacky. Is there a better way? Should this be done at all? The same goes for the GRPCIntegration when using Qdrant in gRPC mode.

  3. Any ideas on writing tests for this? Qdrant supports a :memory: option, but the monkey patches do not apply in that case, since it does not simulate a server and instead performs all operations locally (see https://github.com/qdrant/qdrant-client/tree/master/qdrant_client/local). Mocking the responses would work but would be extremely cumbersome: I think we'd have to write a different mock for every single endpoint so that the Qdrant SDK does not break when handling the response. This would also have to be done twice, once for REST and once for gRPC. I guess we could parse their docs and auto-generate mock responses?
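
To make questions 1 and 2 more concrete, here is a rough, simplified sketch of the two ideas. It is not the code from this PR: Span and SpanRecorder are plain stand-in classes rather than the real Sentry objects, the function names are hypothetical, and the op values "http.client" and "grpc.client" for the spans emitted by the generic integrations are an assumption on my part.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed op values of the spans created by the generic HTTP/gRPC integrations.
GENERIC_CLIENT_OPS = {"http.client", "grpc.client"}


@dataclass
class Span:
    """Stand-in for a Sentry span; only the fields needed for this sketch."""
    span_id: str
    op: str
    description: str = ""
    parent_span_id: Optional[str] = None


@dataclass
class SpanRecorder:
    """Stand-in for the transaction's private _span_recorder."""
    spans: List[Span] = field(default_factory=list)


def qdrant_span_op(prefer_grpc: bool) -> str:
    """Question 1: choose the span op from the client's transport mode."""
    return "db.qdrant.grpc" if prefer_grpc else "db.qdrant.rest"


def mute_child_client_spans(recorder: SpanRecorder, qdrant_span: Span) -> int:
    """Question 2: drop generic client spans that are direct children of
    qdrant_span. Returns the number of spans removed."""
    before = len(recorder.spans)
    recorder.spans = [
        s
        for s in recorder.spans
        if not (
            s.parent_span_id == qdrant_span.span_id
            and s.op in GENERIC_CLIENT_OPS
        )
    ]
    return before - len(recorder.spans)


# Small demo with made-up span ids and descriptions.
recorder = SpanRecorder()
qdrant = Span("a1", qdrant_span_op(prefer_grpc=False), "qdrant search")
duplicate = Span("b2", "http.client", "request sent by the Qdrant SDK", "a1")
unrelated = Span("c3", "http.client", "some other HTTP call")
recorder.spans.extend([qdrant, duplicate, unrelated])
removed = mute_child_client_spans(recorder, qdrant)
```

In the demo, only the http.client span parented to the Qdrant span is dropped; the unrelated HTTP span stays. Filtering by parent id is what keeps this from muting HTTP or gRPC calls the application makes on its own, but it still mutates recorder state after the fact, which is the part that feels hacky.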

mxrcooo, Oct 08 '24 00:10