[Feature Request] Add search to BaseVectorStorage and all classes that implement it

Open AveryYay opened this issue 9 months ago • 4 comments

Required prerequisites

  • [x] I have searched the Issue Tracker and Discussions that this hasn't already been reported. (+1 or comment there if it has.)
  • [ ] Consider asking first in a Discussion.

Motivation

To enable users to perform searches (with some filters) within the vector storage.

Reference: https://github.com/camel-ai/camel/pull/1941#discussion_r2008802053

Solution

No response

Alternatives

No response

Additional context

No response

AveryYay avatar Mar 24 '25 04:03 AveryYay

Hey @AveryYay, just a suggestion: it seems unnecessary to add a separate search method to BaseVectorStorage. Instead, we can simply enhance the existing query method by adding an optional filter_conditions: Optional[Dict[str, Any]] = None parameter. Implementations can then import and construct filter expressions as needed (e.g., for Milvus or Qdrant). Something like the following:

    # Assumes the module already imports:
    #   from typing import Any, Dict, List, Optional, cast
    def query(
        self,
        query: VectorDBQuery,
        filter_conditions: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> List[VectorDBQueryResult]:
        from qdrant_client.http.models import (
            Condition,
            FieldCondition,
            Filter,
            MatchValue,
        )

        # Construct filter if filter_conditions is provided
        search_filter = None
        if filter_conditions:
            must_conditions = [
                FieldCondition(key=key, match=MatchValue(value=value))
                for key, value in filter_conditions.items()
            ]
            search_filter = Filter(must=cast(List[Condition], must_conditions))

        # Execute the search with optional filter
        search_result = self._client.query_points(
            collection_name=self.collection_name,
            query=query.query_vector,
            with_payload=True,
            with_vectors=True,
            limit=query.top_k,
            query_filter=search_filter,
            **kwargs,
        )

        query_results = [
            VectorDBQueryResult.create(
                similarity=point.score,
                id=str(point.id),
                payload=point.payload,
                vector=point.vector,  # type: ignore[arg-type]
            )
            for point in search_result.points
        ]

        return query_results

subway-jack avatar Mar 26 '25 04:03 subway-jack

Hi @subway-jack, thank you for the suggestion! However, for better performance and cleaner code, I think we should keep these functionalities separate (query can still call search internally). In my opinion each function should do one job, and search (pure filtering) is different from query. Adding a filter to query is definitely doable; I just think it would be better to separate the two.

AveryYay avatar Mar 26 '25 06:03 AveryYay

Oh, I see. So the search you are talking about refers to pure filtering, without content matching.

subway-jack avatar Mar 26 '25 06:03 subway-jack

Yes!

AveryYay avatar Mar 27 '25 14:03 AveryYay
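
For illustration only, the distinction agreed on above (search as pure payload filtering, query as similarity ranking that may reuse search) could be sketched roughly as follows. This is a minimal in-memory sketch, not CAMEL's implementation: `Record`, `InMemoryVectorStorage`, `_cosine`, and the exact-match semantics of `filter_conditions` are all hypothetical and chosen only to make the example self-contained.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Record:
    """Hypothetical stored item: an id, a vector, and a metadata payload."""
    id: str
    vector: List[float]
    payload: Dict[str, Any] = field(default_factory=dict)


def _cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStorage:
    """Hypothetical stand-in for a BaseVectorStorage subclass."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def add(self, records: List[Record]) -> None:
        self._records.extend(records)

    def search(
        self, filter_conditions: Optional[Dict[str, Any]] = None
    ) -> List[Record]:
        """Pure filtering: exact match on payload fields, no vector involved."""
        if not filter_conditions:
            return list(self._records)
        return [
            r
            for r in self._records
            if all(
                k in r.payload and r.payload[k] == v
                for k, v in filter_conditions.items()
            )
        ]

    def query(
        self,
        query_vector: List[float],
        top_k: int = 1,
        filter_conditions: Optional[Dict[str, Any]] = None,
    ) -> List[Record]:
        """Similarity ranking over the records that pass the filter."""
        candidates = self.search(filter_conditions)
        ranked = sorted(
            candidates,
            key=lambda r: _cosine(query_vector, r.vector),
            reverse=True,
        )
        return ranked[:top_k]
```

In this sketch, `search` never touches vectors, while `query` delegates filtering to `search` and only adds ranking. A real backend such as Qdrant or Milvus would more likely push the filter down into the database rather than filter in Python.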