spring-ai

Add 'like' operator in FilterExpressionBuilder

Open rsandx opened this issue 6 months ago • 2 comments

Expected Behavior

The documentation for FilterExpressionBuilder says "This builder DSL mimics the common https://www.baeldung.com/hibernate-criteria-queries syntax." I'd expect all common Hibernate operators to be supported, but the 'like' operator is not.

Current Behavior

There is no 'like' operator, so you can't filter on a metadata value that contains a certain string.

Context

Similarity search often can't return the desired results, because the algorithm does not work like a traditional deterministic comparison. In those cases we could combine it with a traditional search against metadata to improve the results. An important and very useful operator for that kind of search is 'like'. Please implement it in FilterExpressionBuilder and related classes.

rsandx, Feb 13 '24 23:02

The Spring AI VectorStore filter-expression syntax is based on the common SQL filter-expression grammar.

Our aim is to support filter expressions that can be run and ported across the various Vector Store implementations.

Different stores, though, support different subsets of filter operators. For some gaps, such as a missing NOT or IN/NIN, we have provided logical transformations that convert such expressions into semantically equivalent expressions using different operators. For example, A IN [x, y, z] can be transformed into A == x || A == y || A == z. Hope you get the idea.
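To make the IN/NIN rewrite above concrete, here is a minimal sketch in plain Java. It is not Spring AI's implementation: the class `InExpansionSketch` and its methods are hypothetical names, and the output is just a text expression, assuming string literals are written in single quotes.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Spring AI: rewrites IN/NIN as text expressions.
final class InExpansionSketch {

    private InExpansionSketch() {
    }

    /** Rewrites "key IN [v1, v2, ...]" as "key == v1 || key == v2 || ...". */
    static String expandIn(String key, List<?> values) {
        return expand(key, values, " == ", " || ");
    }

    /** Rewrites "key NIN [v1, v2, ...]" as "key != v1 && key != v2 && ...". */
    static String expandNin(String key, List<?> values) {
        return expand(key, values, " != ", " && ");
    }

    private static String expand(String key, List<?> values, String op, String joiner) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("at least one value is required");
        }
        return values.stream()
                .map(v -> key + op + literal(v))
                .collect(Collectors.joining(joiner));
    }

    // Assumption: strings are single-quoted; no escaping is done in this sketch.
    private static String literal(Object value) {
        return (value instanceof String) ? "'" + value + "'" : String.valueOf(value);
    }
}
```

The same idea does not carry over to LIKE, because a pattern match cannot be expressed as a finite combination of equality tests.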

Unfortunately, the LIKE operator seems to be supported by only a limited set of Vector Stores, and there is no obvious workaround to compensate for this deficiency.

If you have ideas or want to contribute in this space you are more than welcome.

tzolov, Feb 14 '24 13:02

@tzolov, thanks for looking into this issue quickly. I understand that the LIKE operator is more complicated than the simple comparison operators, especially if your goal is to map directly to the syntax supported by the target vector stores. But since you have already provided logical transformations for some operators, such as NOT or IN/NIN, I wonder if you could do the same for LIKE. Here is a post that gives some ideas for implementing a SQL-style 'LIKE' operator in Java. I particularly like the following comment:

"You could turn '%string%' to contains(), 'string%' to startsWith() and '%string' to endsWith().

You should also run toLowerCase() on both the string and pattern as LIKE is case-insensitive.

Not sure how you'd handle '%string%other%' except with a Regular Expression though."
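As a rough illustration of the quoted idea, the sketch below translates a LIKE pattern into a `java.util.regex.Pattern`, which also covers the '%string%other%' case. `LikeMatcherSketch` is a hypothetical name. The case-insensitive matching follows the quoted comment, although in real databases it depends on the engine and collation. Handling `_` as a single-character wildcard follows standard SQL and is an addition here, and escape characters are not supported.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spring AI: evaluates a SQL-style LIKE pattern in Java.
final class LikeMatcherSketch {

    private LikeMatcherSketch() {
    }

    /** Converts a LIKE pattern to a regex: '%' becomes ".*", '_' becomes ".", the rest is literal. */
    static Pattern toRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%' || c == '_') {
                flushLiteral(literal, regex);
                regex.append(c == '%' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        flushLiteral(literal, regex);
        // Assumption taken from the quoted comment: matching ignores case.
        int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.DOTALL;
        return Pattern.compile(regex.toString(), flags);
    }

    /** Returns true if the whole value matches the LIKE pattern; null never matches. */
    static boolean like(String value, String likePattern) {
        return value != null && toRegex(likePattern).matcher(value).matches();
    }

    // Quotes the pending literal text so regex metacharacters such as '.' stay literal.
    private static void flushLiteral(StringBuilder literal, StringBuilder regex) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }
}
```

With this, `LikeMatcherSketch.like("Spring AI", "%ai")` behaves like a case-insensitive endsWith(), and a pattern with several `%` wildcards needs no special casing.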

So in the worst case, the LIKE logic could be implemented with some post-processing. That would be inefficient, but you may find a more efficient way to do it. Hope this helps.
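A minimal sketch of what such post-processing might look like, in plain Java. The `Hit` class is a hypothetical stand-in for a search result with metadata and is not a Spring AI type. The caller is assumed to fetch more results than needed from the similarity search, since the filter discards some of them.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch, not part of Spring AI: filters search hits after retrieval.
final class MetadataPostFilterSketch {

    private MetadataPostFilterSketch() {
    }

    /** Stand-in for a search result: an id plus its metadata. */
    static final class Hit {
        final String id;
        final Map<String, Object> metadata;

        Hit(String id, Map<String, Object> metadata) {
            this.id = id;
            this.metadata = metadata;
        }
    }

    /**
     * Keeps, in their original order, at most `limit` hits whose metadata value
     * for `key` is a string containing `needle`, ignoring case.
     */
    static List<Hit> filterContains(List<Hit> hits, String key, String needle, int limit) {
        String lowered = needle.toLowerCase(Locale.ROOT);
        return hits.stream()
                .filter(hit -> hit.metadata.get(key) instanceof String)
                .filter(hit -> ((String) hit.metadata.get(key))
                        .toLowerCase(Locale.ROOT)
                        .contains(lowered))
                .limit(limit)
                .collect(Collectors.toList());
    }
}
```

The cost is the one already noted: every candidate has to be retrieved and inspected on the client, and if too few candidates survive the filter, the search has to be repeated with a larger result size.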

rsandx, Feb 14 '24 14:02