rdf4j
rdf4j copied to clipboard
GH-635 add Query Evaluation Mode enum to replace separate EvaluationStrategy impls
GitHub issue resolved: #635
Briefly describe the changes proposed in this PR:
- add new QueryEvaluationMode enum as a TransactionSetting
- make evaluation mode configurable on a per-transaction basis
- update repository config templates to allow setting the query evaluation mode
- deprecate old evaluation strategy impls in favor of a single strategy that uses the mode setting to guide behaviour
PR Author Checklist (see the contributor guidelines for more details):
- [x] my pull request is self-contained
- [x] I've added tests for the changes I made
- [x] I've applied code formatting (you can use
mvn process-resourcesto format from the command line) - [x] I've squashed my commits where necessary
- [x] every commit message starts with the issue number (GH-xxxx) followed by a meaningful description of the change
@hmottestad I am getting test failures due to assertion errors on parent nodes in the algebra tree all over the place. It could be that my changes cause this but at first glance it seems unlikely as I'm not directly touching the algebra tree. What's the status of the develop branch with respect to this at the moment? Can you point me to recent fixes that you did in this space - it's possible that I undid some of your changes when rebasing (this branch does a lot of code refactoring).
https://github.com/eclipse/rdf4j/pull/4103
Also a few fixes in https://github.com/eclipse/rdf4j/pull/4117
I've found the problem - it's the move of that optimizer pipeline to a new package tripping me up again. My code was still using the impl.StandardQueryOptimizerPipeline, which uses the impl optimizer classes, which have not been fixed to deal with node parent references.
We should make the deprecated optimizer classes empty subclasses of the new implementations to avoid this kind of thing - if we can.
Logged issue to fix the deprecation setup: https://github.com/eclipse/rdf4j/issues/4129
@hmottestad when you have a sec could you take a quick look at the failing ShaclTests I have on this PR?
My suspicion is that they fail because they were written for a store that used the StrictEvaluationStrategy, and since in this PR the stores all use StandardEvaluationStrategy (which is basically what used to be called "Extended") the results from some queries are a little different. But I'm having a hard time getting my head around the precise meaning/intent of the tests and what should be adjusted to fix.
select distinct ?target_0000000000 ?value {
?target_0000000000 a <http://example.com/ns#Person> .
?target_0000000000 <http://example.com/ns#date> ?value .
BIND(("2018-01-01"^^<http://www.w3.org/2001/XMLSchema#date> >= ?value) as ?9354941586934ef58838394e53eb08e7_0_)
FILTER(COALESCE(?9354941586934ef58838394e53eb08e7_0_, true))
}
This doesn't return any result anymore.
select distinct ?target_0000000000 ?value { ?target_0000000000 a <http://example.com/ns#Person> . ?target_0000000000 <http://example.com/ns#date> ?value . BIND(("2018-01-01"^^<http://www.w3.org/2001/XMLSchema#date> >= ?value) as ?9354941586934ef58838394e53eb08e7_0_) FILTER(COALESCE(?9354941586934ef58838394e53eb08e7_0_, true)) }This doesn't return any result anymore.
I'm not sure I follow. In minExclusive case 5, when looking at the test data, ?value will be bound to "2018-01-02T21:00:00"^^xsd:dateTime , which means hte comparison operator in the BIND will return false, and therefore the outcome of the COALESCE will be false and the final filter will fail - which means the query result is empty. In other words, isn't it expected that this query returns no result?
select distinct ?target_0000000000 ?value { ?target_0000000000 a <http://example.com/ns#Person> . ?target_0000000000 <http://example.com/ns#date> ?value . BIND(("2018-01-01"^^<http://www.w3.org/2001/XMLSchema#date> >= ?value) as ?9354941586934ef58838394e53eb08e7_0_) FILTER(COALESCE(?9354941586934ef58838394e53eb08e7_0_, true)) }This doesn't return any result anymore.
I'm not sure I follow. In minExclusive case 5, when looking at the test data,
?valuewill be bound to"2018-01-02T21:00:00"^^xsd:dateTime, which means hte comparison operator in theBINDwill returnfalse, and therefore the outcome of theCOALESCEwill befalseand the final filter will fail - which means the query result is empty. In other words, isn't it expected that this query returns no result?
To answer my own question: the difference is probably that we're comparing an xsd:Date and an xsd:DateTime, which in strict evaluation would result in a type error, but in extended/standard evaluation is allowed.
Maybe we should stick with strict mode as the default?
I presume the purpose of this PR is to allow for strict/extended to be configured per transaction. We don't really have to make any changes to the defaults now, do we?
Good point. I do want to switch to non-strict as the default if possible, but perhaps I should split that out and get the framework set up first...
Still to do:
- [x] ensure old-style strategy (factory) config is no longer part of template
- [x] ensure old-style strategy (factory) configs will still work / get converted
- [x] add prompt to set query eval mode in config
- [x] add test coverage for per-transaction/query setting of evaluation mode
Which reminds that I need to double-check that repository config in the workbench still works as exepcted.
- [x] test workbench
Assuming the build succeeds (:crossed_fingers: ) this PR is now ready for final review. I'd appreciate your thoughts, peeps.