
Parser support for shorthand operators (vector functionality)

Open rolandstirnimann opened this issue 1 year ago • 1 comment

The parser should support the shorthand distance operators for vectors, introduced with Oracle 23.4. Currently, the statements below cause parse errors. As a workaround, the classical function-based syntax can be used.

SELECT doc_id, chunk_id, chunk_data
  FROM doc_chunks
 ORDER BY chunk_embedding <-> :query_vector -- shorthand for EUCLIDEAN
 FETCH FIRST 4 ROWS ONLY;
 
SELECT doc_id, chunk_id, chunk_data
  FROM doc_chunks
 ORDER BY chunk_embedding <=> :query_vector -- shorthand for COSINE
 FETCH FIRST 4 ROWS ONLY;

SELECT doc_id, chunk_id, chunk_data
  FROM doc_chunks
 ORDER BY chunk_embedding <#> :query_vector -- shorthand for DOT
 FETCH FIRST 4 ROWS ONLY;

rolandstirnimann, May 28 '24 04:05

Yes, these shorthand distance operators were introduced in 23.4 and documented in May 2024. Therefore, I consider this an enhancement request, not a bug.

Here are the examples using the "classical" syntax (functions instead of operators) that do not cause parse errors in version 5.0.1:

-- alternative for <->
select doc_id, chunk_id, chunk_data
  from doc_chunks
 order by vector_distance(chunk_embedding, :query_vector, euclidean)
 fetch first 4 rows only;

-- alternative for <=>
select doc_id, chunk_id, chunk_data
  from doc_chunks
 order by vector_distance(chunk_embedding, :query_vector, cosine)
 fetch first 4 rows only;

-- alternative for <#>
select doc_id, chunk_id, chunk_data
  from doc_chunks
 order by vector_distance(chunk_embedding, :query_vector, dot)
 fetch first 4 rows only;

PhilippSalvisberg, May 28 '24 05:05
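
Until the parser accepts the shorthand operators, the manual workaround described in this thread could be automated as a preprocessing step. The following is only a minimal sketch, not part of plsql-cop-cli: the helper name `to_classical` and the regular expression are hypothetical, and the rewrite assumes that both operands are plain (optionally dotted) identifiers or bind variables. It is purely textual, so it does not handle function calls or parenthesized expressions as operands, and it would also rewrite matches inside comments or string literals.

```python
import re

# Mapping taken from the thread: shorthand operator -> metric keyword
# used as the third argument of vector_distance.
METRICS = {"<->": "euclidean", "<=>": "cosine", "<#>": "dot"}

# Assumption: an operand is a plain, optionally dotted identifier or a
# bind variable such as :query_vector. Anything more complex is ignored.
_OPERAND = r":?[A-Za-z_][A-Za-z0-9_$#]*(?:\.[A-Za-z_][A-Za-z0-9_$#]*)*"
_PATTERN = re.compile(rf"({_OPERAND})\s*(<->|<=>|<#>)\s*({_OPERAND})")


def to_classical(sql: str) -> str:
    """Rewrite `a <op> b` into `vector_distance(a, b, metric)` (hypothetical helper)."""

    def repl(match: re.Match) -> str:
        left, op, right = match.groups()
        return f"vector_distance({left}, {right}, {METRICS[op]})"

    return _PATTERN.sub(repl, sql)


if __name__ == "__main__":
    print(to_classical(" ORDER BY chunk_embedding <-> :query_vector"))
```

For the first example in this issue, the sketch turns `chunk_embedding <-> :query_vector` into `vector_distance(chunk_embedding, :query_vector, euclidean)`, which matches the classical form shown above. A text-based rewrite like this is fragile by design; the proper fix remains grammar support in the parser.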