redis-product-search

User text vector search

Open Spartee opened this issue 2 years ago • 0 comments

Description

In addition to the current full-text search capability, we should offer a natural-language vector search that users can use to find products.

This is already largely implemented, but it is turned off because its performance is not yet good enough.

Related Code

Backend -> routes.py API route

@r.post("/vectorsearch/text/user",
        response_model=t.List[Product],
        name="product:find_similar_by_user_text",
        operation_id="compute_user_text_similarity")
async def find_products_by_user_text(similarity_request: UserTextSimilarityRequest) -> t.List[Product]:
    q = create_query(similarity_request.search_type,
                     similarity_request.number_of_results,
                     vector_field_name="text_vector",
                     gender=similarity_request.gender,
                     category=similarity_request.category)

    redis_client = await Redis(host=config.REDIS_HOST, port=config.REDIS_PORT, db=0)

    # obtain vector from the text model defined in the top-level __init__.py
    vector = TEXT_MODEL.encode(similarity_request.user_text)
    # obtain results of the query
    results = await redis_client.ft().search(q, query_params={"vec_param": vector.tobytes()})

    # Get Product records of those results
    similar_product_pks = [p.product_pk for p in results.docs]
    similar_products = [await Product.get(pk) for pk in similar_product_pks]
    return similar_products
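
The list comprehension at the end of the route awaits each Product.get call in turn. If lookup latency turns out to matter, one option is to issue the lookups concurrently. The following is only a minimal sketch of that idea: fetch_product is a hypothetical stand-in for Product.get, and whether this helps in practice would need to be measured.

```python
import asyncio
from typing import List


async def fetch_product(pk: str) -> dict:
    # Hypothetical stand-in for Product.get(pk); the real call hits Redis.
    await asyncio.sleep(0)
    return {"pk": pk}


async def fetch_products(pks: List[str]) -> List[dict]:
    # asyncio.gather schedules all lookups concurrently and returns
    # the results in the same order as the input keys.
    return await asyncio.gather(*(fetch_product(pk) for pk in pks))
```

Because gather preserves input order, the ranking returned by the vector query would be kept intact.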

Backend -> Pydantic Schema for API route

class UserTextSimilarityRequest(BaseModel):
    user_text: str
    number_of_results: int = 15
    search_type: str = "KNN"
    gender: str = ""
    category: str = ""

The Hugging Face model is held as a global variable within the top-level __init__.py. This could probably be improved.
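
One possible alternative to a module-level global is a cached accessor that builds the model on first use. This is a rough sketch, not the project's implementation: load_text_model and the stub class are hypothetical placeholders so the example runs without downloading a real model.

```python
from functools import lru_cache
from typing import List


class StubTextModel:
    """Hypothetical placeholder for the real Hugging Face text model."""

    def encode(self, text: str) -> List[float]:
        # The real model would return an embedding vector.
        return [float(len(text))]


def load_text_model() -> StubTextModel:
    # Assumption: in the real app this would construct the actual model.
    return StubTextModel()


@lru_cache(maxsize=1)
def get_text_model() -> StubTextModel:
    # The model is created on the first call and reused afterwards.
    return load_text_model()
```

The route could then call get_text_model().encode(...) instead of importing TEXT_MODEL, which also makes the model easier to replace in tests.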

Frontend -> Header.tsx (currently commented out)

<Button
  onClick={() => queryProductsByUserText()}
  variant="outline-success"
  disabled={searchText.length < 1}>
  Vector Search
</Button>

Frontend JS to call backend -> api.ts

export const getSemanticallySimilarProductsbyText = async (text: string,
                                                           gender="",
                                                           category="",
                                                           search='KNN',
                                                           limit=15,
                                                           skip=0) => {
    const body = {
        user_text: text,
        search_type: search,
        number_of_results: limit,
        gender: gender,
        category: category
    };

    const url = MASTER_URL + "vectorsearch/text/user";
    return fetchFromBackend(url, 'POST', body);
};

TODO

  • [ ] Investigate the performance of user text vector search. Most of this work will be in the data prep stage: we currently use the description as the text vector data, which is manually cleaned with regex and probably not optimal.
  • [ ] Once performance is acceptable, enable the vector search in addition to text search.
  • [ ] Make sure the buttons look right on the front end.
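
As a starting point for the data prep investigation, here is a minimal sketch of a regex-based description cleaner. The project's actual cleaning rules are not shown in this issue, so the steps below (unescaping HTML entities, dropping tags, collapsing whitespace) are assumptions chosen for illustration, and clean_description is a hypothetical name.

```python
import html
import re

_TAG_RE = re.compile(r"<[^>]+>")  # anything that looks like an HTML tag
_WS_RE = re.compile(r"\s+")       # runs of spaces, tabs and newlines


def clean_description(raw: str) -> str:
    # Decode entities such as &amp; before stripping markup.
    text = html.unescape(raw)
    # Replace tags with a space so adjacent words do not merge.
    text = _TAG_RE.sub(" ", text)
    # Collapse whitespace runs and trim the ends.
    return _WS_RE.sub(" ", text).strip()
```

Comparing search results for a fixed set of queries before and after a change to this step would be one way to judge whether the cleaning is helping.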

Spartee, Aug 08 '22 22:08