hibernate-search
hibernate-search copied to clipboard
HSEARCH-4601 Syntactic sugar for OR/AND predicates
https://hibernate.atlassian.net/browse/HSEARCH-4601
Hi!
So, I went back to this PR and I hope I got it right this time after my initial botching.
I have added support for basic n-ary or(...) / and(...) constructs à la
f.or( f.match().field("foo").matching("val1"),
f.match().field("bar").matching("val2") )
f.and( f.match().field("foo").matching("val1"),
f.match().field("bar").matching("val2") )
I haven't yet added builder-style constructs with .add(...) and with(...).
I guess that should be done.
I haven't updated the documentation yet either.
Before I take the next step, is this going in the right direction? ;-)
I have added new tests in BoolPredicateSpecificsIT, as they are more or less cloned from bool() ones.
This makes the file name not very relevant...
Shall I add a new IT file of its own, at the risk of more duplicated code?
If not, shall "duplicated" methods be grouped further, e.g. in @Nested test classes?
Also, aren't the tests in SmokeIT a bit redundant/pointless? Maybe they should not be added there...
So far, AndPredicateClausesStep and OrPredicateClausesStep don't implement PredicateScoreStep, which would allow calling boost(...) and constantScore() on the overall filter.
Shouldn't they?
Similarly, so far, OrPredicateClausesStep doesn't support any of the minimumShouldMatch constraints. Shall it?
That looks to me a bit far away from the expected semantic of an OR (with minimumShouldMatch == 1), so maybe such a feature should be reserved to more complex bool()-based predicates.
(If nothing should be added, the Javadoc "@return The initial step of a DSL" should be revised.)
Shall something be done to support mustNot?
Again, I'd leave that to bool()...
Shall SearchPredicateFactory.and(...) and or(...) be tagged as @Incubating at first?
As usual, using varargs with a parameterized type requires a lot of @SuppresssWarnings("unchecked"), including in calling code... :'-(
Would you happen to have anything better to suggest?
(Besides n overloads for all possible arities. ;-))
Best regards, Yannis
Hi !
I guess we agree on everything: thanks for the constructive and professional comments!
I'm a bit sad for the and( Function<...>... ) (and or) overloads, but of course I can't judge [yet?] how not that relevant they are.
DONE:
- Moved new tests in dedicated
AndPredicateSpecificsITandOrPredicateSpecificsIT, instead of "polluting"BoolPredicateSpecificsIT. Could you please confirm that duplicated code (around test data initialization) is accepted with all its consequences? Don't we want to share something somewhere (with, conversely, the consequences of stronger coupling)? - Removed the tests added to SmokeIT, as this file has been deleted in another PR.
- Corrected non-updated copy/paste in Javadoc of
SearchPredicateFactory.
TODO:
- [ ] 2 code smells reported by Sonar:
Sis indeed not used inAndPredicateClausesStepandOrPredicateClausesStep. To be removed if chaining / aself()method with covariant return type is indeed unnecessary (but I guess it will be necessary). - [ ]
AndPredicateClausesStepandOrPredicateClausesStepshould implementPredicateScoreStepin order to supportboost(...)andconstantScore(). - [ ] Rename
AndPredicateClausesStepto something likeAndPredicateOptionsStep(seeExistsPredicateOptionsStepandMatchPredicateOptionsStep). - [ ] Ditto for
OrPredicateClausesStep. - [ ] Remove the overload
AbstractSearchPredicateFactory.and( Function<...>... )+or( Function<...>... ). - [ ] Clean up copy-pasted code that ended up being unnecessary in
AndPredicateSpecificsITandOrPredicateSpecificsIT.
Carried-over TODOs:
- [ ] Add builder-style constructs with
.add(...)andwith(...). - [ ] Documentation
Could you please confirm that duplicated code (around test data initialization) is accepted with all its consequences? Don't we want to share something somewhere (with, conversely, the consequences of stronger coupling)?
I sense you're probing for how well we apply best practices, so I'll detail a bit.
We avoid duplicating code.
In the main code, the only parts where we do duplicate code is APIs, when we have to (multiple Maven modules, each with its own API but with subtle differences).
In tests, and actually made a great effort to avoid code duplication. Just have a look at MatchPredicateBaseIT, RangePredicateBaseIT, etc. A great many tests are shared across multiple predicates, but as you can see, it becomes much harder to understand what's going on in a particular test. For instance, you have to read through three different classes to get a complete view of what happens exactly in org.hibernate.search.integrationtest.backend.tck.search.predicate.MatchPredicateBaseIT.AnalysisIT#multiIndex_incompatibleAnalyzer.
So, really, as with most things, it's a trade-off. Avoiding duplication is fine but if it turns the code into a write-only, overly complex mess that external contributors cannot understand anymore (and frankly we're getting close...)... then it's not worth it.
I also generally accept higher complexity for the main code than for tests, because, well, can you trust tests that are more complex than what you're testing?
If you can find a way to "factorize" the duplicated code, have one test class for bool, one for or, and one for and, while keeping the code clear... then sure, I'm all for it. I just don't think it will be easy.
2 code smells reported by Sonar: S is indeed not used in
AndPredicateClausesStepandOrPredicateClausesStep. To be removed if chaining / a self() method with covariant return type is indeed unnecessary (but I guess it will be necessary).
This is about backwards compatibility.
Indeed, we don't need the S parameter right now. But if we add methods to AndPredicateClausesStep and OrPredicateClausesStep, then we might need it. And adding the S parameter in a later version may break backwards compatibility for our users: if their code refers to AndPredicateClausesStep explicitly (ideally they shouldn't have to, but well, shit happens) at best they'll get raw type warnings, at worst their code won't compile anymore.
Anyway, once you add support for constantScore/boost by extending PredicateScoreStep, you'll see the S parameter becomes necessary, so you won't have this problem anymore :)
Regarding the code, I'll do another review when it's complete, unless you have specific questions ;)
Thanks for the clarifications! I guess we agree that [non-]duplicated code is always a trade-off, and I do agree on the reasoning that tests should be kept as simple as possible in order to avoid complexity-induced side-effects that may hamper their initial purpose.
Speaking about duplicated code (and the corresponing trade-offs), implementations of PredicateScoreStep are very similar in ExistsPredicateFieldStepImpl, CommonState, MatchIdPredicateMatchingStepImpl, AbstractBooleanPredicateClausesStep, and now in AndPredicateOptionsStepImpl and OrPredicateOptionsStepImpl...
I'm wondering whether it could be relevant to introduce some intermediate abstract class below AbstractPredicateFinalStep in order to factorize such concerns. That would probably imply to store the builder at that level (=> getter or... protected final member [non-]variable?), to add yet another parameterized type for it and to generalize a covariant self() method...
(But I'll consider this as beyond the scope of this PR.)
(+1 also for the S parameterized type: that's why I kept it!)
DONE:
- Renamed
AndPredicateClausesSteptoAndPredicateOptionsStepandOrPredicateClausesSteptoOrPredicateOptionsStep. AndPredicateOptionsStepandOrPredicateOptionsStepnow implementPredicateScoreStep. => Cloned and adaptedBoolPredicateBaseITtoAndPredicateBaseITandOrPredicateBaseIT.
Carried-over TODOs:
- [ ] Add builder-style constructs with
.add(...)andwith(...). - [ ] Remove the overload
AbstractSearchPredicateFactory.and( Function<...>... )+or( Function<...>... ). - [ ] Clean up copy-pasted code that ended up being unnecessary in
AndPredicateSpecificsITandOrPredicateSpecificsIT. - [ ] Document the new feature.
I'm wondering whether it could be relevant to introduce some intermediate abstract class below
AbstractPredicateFinalStepin order to factorize such concerns. That would probably imply to store thebuilderat that level (=> getter or...protected finalmember [non-]variable?), to add yet another parameterized type for it and to generalize a covariantself()method...
That's a lot of effort for two (very) simple methods, but sure, feel free to create a ticket (a Task). It's probably going to be non-trivial, though; see org.hibernate.search.engine.search.predicate.dsl.impl.AbstractBooleanMultiFieldPredicateCommonState#applyBoostAndConstantScore.
And yes, renaming thisAsS to self across the board would be a good idea as well.
Also:
to add yet another parameterized type for it
I don't think that's necessary, but we shall see.
DONE:
- Added builder-style constructs with
.add(...)andwith(...).add(...)supports bothPredicateFinalStepandSearchPredicate(in order to support previously-built predicates): did that make sense? - Removed the
AbstractSearchPredicateFactory.and( Function<...>... )+or( Function<...>... )overloads.
Carried-over TODOs:
- [ ] Document the new feature.
TODO :
- [ ] In
add(...)methods, add support forFunction<? super SearchPredicateFactory, ? extends PredicateFinalStep>. - [ ] Make so that using
add(...)andwith()is not "allowed" from theand(...)/or(...)"constructors" that already allow to specify some predicates. (Cf. https://github.com/hibernate/hibernate-search/pull/3152#discussion_r934413732.) - [ ] Prevent support of constructs à la
f.and().with( c -> c.toPredicate() ).add( ... ). (Cf. https://github.com/hibernate/hibernate-search/pull/3152#discussion_r934420835.)
Carried-over TODOs:
- [ ] Document the new feature.
DONE :
- In
add(...)methods, add support forFunction<? super SearchPredicateFactory, ? extends PredicateFinalStep>. - Merged the two class hierarchies for
andandorin a single one, as only one method call actually changes and interface/classes names don't seem very relevant (+ support formustNotis coming)... - Reworked the class hierarchy so that:
add(...)andwith()are no longer accessible from theand(...)/or(...)"constructors" that allow to specify some predicates, but only from the parameterless "constructors".- In consumers,
toPredicate()is no longer accessible. - As such, the following code doesn't compile:

FTR, the class hierarchy now looks like this (I wish I had IntelliJ Ultimate...):
Carried-over TODOs:
- [ ] Document the new predicates.
- support for
mustNotis coming
I'd say that PR is already more than enough work, let's not bother right now ;)
That's a separate ticket anyway: https://hibernate.atlassian.net/browse/HSEARCH-4645
FTR, the build failure was due to an attempt to address an IntelliJ warning, which in turn triggered a Javadoc warning... ;-) Will undo!
DONE:
- Renamed everything to ~
*SimpleBooleanOperatorPredicate*.
Carried-over TODOs:
- [ ] Document the new predicates.
DONE:
- First iteration on the documentation:
- Inserted an "
and: combine predicates as a conjunction" and an "or: combine predicates as a disjunction" section before the initial "bool: combine predicates (or/and/...)" section, renamed as "bool: advanced combinations of predicates (or/and/…)". (Intros sound a bit verbose to me: WDYT?) - Replaced the examples for pure
andandorpredicates with "[...] using the simplerand/orsyntax is recommended" and corresponding references.
- Inserted an "
TODO:
- [ ] Links to the initial
boolsection should be augmented with links toandandoras appropriate. - [ ] Some examples that only use AND/OR pattern should be simplified with the simpler syntax: OK?
- [ ] There should be a section "Adding clauses dynamically with the lambda syntax" for the
andpredicate (and a reference to it in order to avoid repetitions in theorsection) in order to introduce theaddandwithsyntaxes. However, I'm wondering whether we'd need specific "overloads" ofSearchQueryWhereStep.where(...)for that purpose, that clearly state if we want anandor anorhere... WDYT?
Any other comments?
Here is the generated PDF for easier review: hibernate_search_reference.pdf.
DONE:
- Addressed review comments.
- In the new
andandorsections, added "Adding clauses dynamically with the lambda syntax" subsections. Note that the "cascadedwith" syntax is only explained in theorsection, as the point is mostly to combine two different operators (=> these two different operators must have been introduced first). - Simplified the examples that used pure
bool()-based AND/OR (see attached PDF for easier review):- Example 65. Searching with a flattened structure
- Example 146. Declaring a named predicate
- Skipped Example 243. Using implicit nesting as it would be confusing to switch to the simplified syntax as this example is a variation of "Example 242. Matching multiple predicates against a single nested object".
- Example 258. Boosting on a per-predicate basis
- Example 260. Making the score of a predicate constant
- Example 342. Targeting a field using relative paths
Here is the generated PDF for easier review: hibernate_search_reference.pdf.
And now, by reading the examples with lambdas, i.e. when the operator instantiation and the add operations are distant from one another, I'm wondering if such syntax isn't requiring an excessive cognitive load for casual readers to figure out what kind of operation this add is performing. I'd say the cognitive load would be reduced if the methods were to be more explicitly named and and or. For the forthcoming not predicate (HSEARCH-4645), renaming add to something like andNot may also be more explicit.
In addition to making code more self-explanatory, having specific "add" methods would be less error-prone when different predicates are used in different with(...) constructs, as in "Example 228. Easily adding clauses dynamically using with(…) and the lambda syntax".
Note that doing so would totally mess the current class hierarchy, which is quite operator-agnostic as it is... ;-)
Thanks for your pull request!
This pull request appears to follow the contribution rules.
› This message was automatically generated.
I rebased, applied my suggestions and force-pushed to your branch. Let's see what CI has to say...







