data-api-builder
data-api-builder copied to clipboard
[Enh]: Add `VECTOR_SEARCH()` in filter (MSSQL)
NOTE: Please do not start this without talking to Jerry about Vector indexes.
What?
Add VECTOR_SEARCH method to filter.
- Supported in GraphQL
filter - Supported in REST
$filter
Behavior
- MSSQL only (this task)
- One function instance per query
- Injects a new column
_vector_distance - Allowed only on base tables (
type = table) - Result includes all selected columns plus
_vector_distance
Why?
Enable semantic search in SQL via vector similarity.
How?
- [ ] Implement
VECTOR_SEARCHin REST$filter - [ ] Implement
VECTOR_SEARCHin GraphQLfilter - [ ] Extend SQL generator to emit
AI_GENERATE_EMBEDDINGS+VECTOR_DISTANCE - [ ] Ensure default ORDER BY
_vector_distance DESCunless user specifies - [ ] Add
_vector_distancecolumn to query results
Function
VECTOR_SEARCH(COLUMN, SIMILAR_TO_STRING, METRIC, TOP_N, EMBEDDINGS_MODEL)
Requires AI_GENERATE_EMBEDDINGS configured for EMBEDDINGS_MODEL
Parameters
- COLUMN (string): embeddings column to search
- SIMILAR_TO_STRING (string): input text to compare
-
METRIC (enum): similarity metric (
cosine | dot | euclidean) - TOP_N (int): number of matches to return (must be positive)
-
EMBEDDINGS_MODEL: embeddings model to use with
[AI_GENERATE_EMBEDDINGS](https://learn.microsoft.com/en-us/sql/t-sql/functions/ai-generate-embeddings-transact-sql?view=sql-server-ver17)
REST example
GET https://server/api/Reviews?$filter=VECTOR_SEARCH(embedding,'best camera','cosine',3,'text-embedding-3-small')
GraphQL example
query {
reviews(
filter: {
VECTOR_SEARCH: [
"embedding",
"best camera",
"cosine",
3,
"Ada2Embeddings"
]
}
) {
items {
id
content
_vector_distance
}
}
}
T-SQL equivalent
DECLARE @qv VECTOR(1536) = AI_GENERATE_EMBEDDINGS
(N'best camera' USE MODEL Ada2Embeddings);
SELECT TOP (10)
id,
VECTOR_DISTANCE('cosine', @qv, [content_vector]) AS distance,
title
FROM [dbo].[Reviews]
ORDER BY distance;