data-api-builder icon indicating copy to clipboard operation
data-api-builder copied to clipboard

[Enh]: Add `VECTOR_SEARCH()` in filter (MSSQL)

Open JerryNixon opened this issue 5 months ago • 0 comments

NOTE: Please do not start this without talking to Jerry about Vector indexes.

What?

Add VECTOR_SEARCH method to filter.

  • Supported in GraphQL filter
  • Supported in REST $filter

Behavior

  • MSSQL only (this task)
  • One function instance per query
  • Injects a new column _vector_distance
  • Allowed only on base tables (type = table)
  • Result includes all selected columns plus _vector_distance

Why?

Enable semantic search in SQL via vector similarity.

How?

  • [ ] Implement VECTOR_SEARCH in REST $filter
  • [ ] Implement VECTOR_SEARCH in GraphQL filter
  • [ ] Extend SQL generator to emit AI_GENERATE_EMBEDDINGS + VECTOR_DISTANCE
  • [ ] Ensure default ORDER BY _vector_distance DESC unless user specifies
  • [ ] Add _vector_distance column to query results

Function

VECTOR_SEARCH(COLUMN, SIMILAR_TO_STRING, METRIC, TOP_N, EMBEDDINGS_MODEL)

Requires AI_GENERATE_EMBEDDINGS configured for EMBEDDINGS_MODEL

Parameters

  1. COLUMN (string): embeddings column to search
  2. SIMILAR_TO_STRING (string): input text to compare
  3. METRIC (enum): similarity metric (cosine | dot | euclidean)
  4. TOP_N (int): number of matches to return (must be positive)
  5. EMBEDDINGS_MODEL: embeddings model to use with [AI_GENERATE_EMBEDDINGS](https://learn.microsoft.com/en-us/sql/t-sql/functions/ai-generate-embeddings-transact-sql?view=sql-server-ver17)

REST example

GET https://server/api/Reviews?$filter=VECTOR_SEARCH(embedding,'best camera','cosine',3,'text-embedding-3-small')

GraphQL example

query {
  reviews(
    filter: {
      VECTOR_SEARCH: [
        "embedding",
        "best camera",
        "cosine",
        3,
        "Ada2Embeddings"
      ]
    }
  ) {
    items {
      id
      content
      _vector_distance
    }
  }
}

T-SQL equivalent

DECLARE @qv VECTOR(1536) = AI_GENERATE_EMBEDDINGS
(N'best camera' USE MODEL Ada2Embeddings);

SELECT TOP (10) 
    id, 
    VECTOR_DISTANCE('cosine', @qv, [content_vector]) AS distance, 
    title
FROM [dbo].[Reviews]
ORDER BY distance;

JerryNixon avatar Aug 21 '25 03:08 JerryNixon