rdf4j GH-635 add Query Evaluation Mode enum to replace separate EvaluationStrategy impls

GitHub issue resolved: #635

Briefly describe the changes proposed in this PR:

add new QueryEvaluationMode enum as a TransactionSetting
make evaluation mode configurable on a per-transaction basis
update repository config templates to allow setting the query evaluation mode
deprecate old evaluation strategy impls in favor of a single strategy that uses the mode setting to guide behaviour

PR Author Checklist (see the contributor guidelines for more details):

[x] my pull request is self-contained
[x] I've added tests for the changes I made
[x] I've applied code formatting (you can use mvn process-resources to format from the command line)
[x] I've squashed my commits where necessary
[x] every commit message starts with the issue number (GH-xxxx) followed by a meaningful description of the change

Jun 06 '22 03:06 abrokenjester

@hmottestad I am getting test failures due to assertion errors on parent nodes in the algebra tree all over the place. It could be that my changes cause this but at first glance it seems unlikely as I'm not directly touching the algebra tree. What's the status of the develop branch with respect to this at the moment? Can you point me to recent fixes that you did in this space - it's possible that I undid some of your changes when rebasing (this branch does a lot of code refactoring).

Aug 27 '22 02:08 abrokenjester

https://github.com/eclipse/rdf4j/pull/4103

Also a few fixes in https://github.com/eclipse/rdf4j/pull/4117

Aug 27 '22 05:08 hmottestad

I've found the problem - it's the move of that optimizer pipeline to a new package tripping me up again. My code was still using the impl.StandardQueryOptimizerPipeline, which uses the impl optimizer classes, which have not been fixed to deal with node parent references.

We should make the deprecated optimizer classes empty subclasses of the new implementations to avoid this kind of thing - if we can.

Aug 27 '22 23:08 abrokenjester

Logged issue to fix the deprecation setup: https://github.com/eclipse/rdf4j/issues/4129

Aug 27 '22 23:08 abrokenjester

@hmottestad when you have a sec could you take a quick look at the failing ShaclTests I have on this PR?

My suspicion is that they fail because they were written for a store that used the StrictEvaluationStrategy, and since in this PR the stores all use StandardEvaluationStrategy (which is basically what used to be called "Extended") the results from some queries are a little different. But I'm having a hard time getting my head around the precise meaning/intent of the tests and what should be adjusted to fix.

Sep 17 '22 06:09 abrokenjester

select distinct ?target_0000000000 ?value {
	?target_0000000000 a <http://example.com/ns#Person> .
	?target_0000000000 <http://example.com/ns#date> ?value .
	BIND(("2018-01-01"^^<http://www.w3.org/2001/XMLSchema#date> >= ?value) as ?9354941586934ef58838394e53eb08e7_0_)
	FILTER(COALESCE(?9354941586934ef58838394e53eb08e7_0_, true))
}

This doesn't return any result anymore.

Sep 17 '22 07:09 hmottestad

select distinct ?target_0000000000 ?value {
	?target_0000000000 a <http://example.com/ns#Person> .
	?target_0000000000 <http://example.com/ns#date> ?value .
	BIND(("2018-01-01"^^<http://www.w3.org/2001/XMLSchema#date> >= ?value) as ?9354941586934ef58838394e53eb08e7_0_)
	FILTER(COALESCE(?9354941586934ef58838394e53eb08e7_0_, true))
}

This doesn't return any result anymore.

I'm not sure I follow. In minExclusive case 5, when looking at the test data, ?value will be bound to "2018-01-02T21:00:00"^^xsd:dateTime , which means hte comparison operator in the BIND will return false, and therefore the outcome of the COALESCE will be false and the final filter will fail - which means the query result is empty. In other words, isn't it expected that this query returns no result?

Sep 18 '22 03:09 abrokenjester

select distinct ?target_0000000000 ?value {
	?target_0000000000 a <http://example.com/ns#Person> .
	?target_0000000000 <http://example.com/ns#date> ?value .
	BIND(("2018-01-01"^^<http://www.w3.org/2001/XMLSchema#date> >= ?value) as ?9354941586934ef58838394e53eb08e7_0_)
	FILTER(COALESCE(?9354941586934ef58838394e53eb08e7_0_, true))
}
This doesn't return any result anymore.
I'm not sure I follow. In minExclusive case 5, when looking at the test data, ?value will be bound to "2018-01-02T21:00:00"^^xsd:dateTime , which means hte comparison operator in the BIND will return false, and therefore the outcome of the COALESCE will be false and the final filter will fail - which means the query result is empty. In other words, isn't it expected that this query returns no result?

To answer my own question: the difference is probably that we're comparing an xsd:Date and an xsd:DateTime, which in strict evaluation would result in a type error, but in extended/standard evaluation is allowed.

Sep 18 '22 03:09 abrokenjester

Maybe we should stick with strict mode as the default?

Sep 18 '22 04:09 hmottestad

I presume the purpose of this PR is to allow for strict/extended to be configured per transaction. We don't really have to make any changes to the defaults now, do we?

Sep 18 '22 04:09 hmottestad

Good point. I do want to switch to non-strict as the default if possible, but perhaps I should split that out and get the framework set up first...

Sep 18 '22 06:09 abrokenjester

Still to do:

[x] ensure old-style strategy (factory) config is no longer part of template
[x] ensure old-style strategy (factory) configs will still work / get converted
[x] add prompt to set query eval mode in config
[x] add test coverage for per-transaction/query setting of evaluation mode

Sep 19 '22 10:09 abrokenjester

Which reminds that I need to double-check that repository config in the workbench still works as exepcted.

[x] test workbench

Sep 24 '22 02:09 abrokenjester

Assuming the build succeeds (:crossed_fingers: ) this PR is now ready for final review. I'd appreciate your thoughts, peeps.

Oct 29 '22 01:10 abrokenjester

rdf4j rdf4j copied to clipboard

GH-635 add Query Evaluation Mode enum to replace separate EvaluationStrategy impls

rdf4j
rdf4j copied to clipboard