dsp-api
Slow query when searching for list nodes
When searching for a list node that has (many) subnodes, the query is very slow both in v1 and v2.
Reproducible with beol data. Example: search for a letter with the topic "Mathematics" or a letter with the topic "Professional Activity".
I think the property path syntax used to find the subnodes is the problem.
v1: https://app1.dasch.swiss/ -> https://api.dasch.swiss/v1/search/?searchtype=extended&property_id=http%3A%2F%2Fwww.knora.org%2Fontology%2F0801%2Fbeol%23hasSubject&compop=EQ&searchval=http%3A%2F%2Frdfh.ch%2Flists%2F0801%2Fprofessional_activity&show_nrows=25&start_at=0&filter_by_restype=http%3A%2F%2Fwww.knora.org%2Fontology%2F0801%2Fbeol%23basicLetter
v2: https://beol.dasch.swiss ->
PREFIX knora-api: <http://api.knora.org/ontology/knora-api/v2#>
CONSTRUCT {
    ?mainRes knora-api:isMainResource true .
    ?mainRes <http://api.dasch.swiss/ontology/0801/beol/v2#hasSubject> ?propVal0 .
} WHERE {
    ?mainRes a knora-api:Resource .
    ?mainRes a <http://api.dasch.swiss/ontology/0801/beol/v2#letter> .
    ?mainRes <http://api.dasch.swiss/ontology/0801/beol/v2#hasSubject> ?propVal0 .
    ?propVal0 <http://api.knora.org/ontology/knora-api/v2#listValueAsListNode> <http://rdfh.ch/lists/0801/mathematics>
}
OFFSET 0
This query takes 10 seconds on my machine:
SELECT DISTINCT (COUNT(DISTINCT ?mainRes) AS ?count)
WHERE {
    ?mainRes <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/knora-base#Resource> .
    GRAPH <http://www.ontotext.com/explicit> {
        ?mainRes <http://www.knora.org/ontology/knora-base#isDeleted> "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
    }
    ?mainRes <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.knora.org/ontology/0801/beol#letter> .
    ?mainRes <http://www.knora.org/ontology/0801/beol#hasSubject> ?propVal0 .
    GRAPH <http://www.ontotext.com/explicit> {
        ?propVal0 <http://www.knora.org/ontology/knora-base#isDeleted> "false"^^<http://www.w3.org/2001/XMLSchema#boolean> .
    }
    ?propVal0 <http://www.knora.org/ontology/knora-base#valueHasListNode> ?propVal0__httpapiknoraorgontologyknoraapiv2listValueAsListNode__httprdfhchlists0801mathematics__listNodeVar .
    <http://rdfh.ch/lists/0801/mathematics> <http://www.knora.org/ontology/knora-base#hasSubListNode>* ?propVal0__httpapiknoraorgontologyknoraapiv2listValueAsListNode__httprdfhchlists0801mathematics__listNodeVar .
}
LIMIT 1
Could this be a problem with our implementation of (hierarchical) lists? In the old Knora I used to add a left-right index, which made it very simple to search for subnodes: the numeric id had to be >= left and <= right. See Hlist.php in old salsah. I used the "nested set" paradigm (see https://en.wikipedia.org/wiki/Nested_set_model).
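To make the left-right idea concrete, here is a minimal Python sketch of the nested set model. It is not the code from Hlist.php; the `Node` class, the `assign_bounds` and `is_in_subtree` helpers, and the small example tree (including the `algebra` and `geometry` nodes) are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A list node carrying a nested-set (left, right) index."""
    name: str
    children: List["Node"] = field(default_factory=list)
    left: int = 0
    right: int = 0


def assign_bounds(node: Node, counter: int = 1) -> int:
    """Number the tree depth-first and return the next free counter value."""
    node.left = counter
    counter += 1
    for child in node.children:
        counter = assign_bounds(child, counter)
    node.right = counter
    return counter + 1


def is_in_subtree(candidate: Node, root: Node) -> bool:
    """True if candidate is root itself or one of its descendants.

    Only two integer comparisons are needed, with no tree traversal.
    """
    return root.left <= candidate.left <= root.right


# Hypothetical example tree, loosely modelled on the beol topics.
algebra = Node("algebra")
geometry = Node("geometry")
mathematics = Node("mathematics", [algebra, geometry])
professional_activity = Node("professional_activity")
topics = Node("topics", [mathematics, professional_activity])

assign_bounds(topics)
```

With this index, "all values whose list node lies under mathematics" becomes a numeric range filter instead of a path traversal. The trade-off is that inserting or moving a node requires renumbering part of the tree.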
@benjamingeer proposes:
The idea is to create a base property that is only inferred, and a subproperty that is used explicitly. Then you make the base property an owl:TransitiveProperty.
So you could do something like this:
:hasDescendantListNode rdf:type owl:ObjectProperty, owl:TransitiveProperty ;
    rdfs:subPropertyOf :objectCannotBeMarkedAsDeleted ;
    :objectClassConstraint :ListNode ;
    :subjectClassConstraint :ListNode .

:hasSubListNode rdf:type owl:ObjectProperty ;
    rdfs:subPropertyOf :objectCannotBeMarkedAsDeleted, :hasDescendantListNode ;
    :objectClassConstraint :ListNode ;
    :subjectClassConstraint :ListNode .
You can see the definition of owl:TransitiveProperty in KnoraRules.pie:
Id: prp_trp
    p rdf:type owl:TransitiveProperty
    x p y
    y p z
    -------------------------------
    x p z
Then, for example, if you have list nodes connected like this:
<http://rdfh.ch/lists/0001/InterestingStuff> knora-base:hasSubListNode <http://rdfh.ch/lists/0001/VirtualReality> .
<http://rdfh.ch/lists/0001/VirtualReality> knora-base:hasSubListNode <http://rdfh.ch/lists/0001/VRNetworkVisualisation> .
The triplestore will infer:
<http://rdfh.ch/lists/0001/InterestingStuff> knora-base:hasDescendantListNode <http://rdfh.ch/lists/0001/VRNetworkVisualisation> .
Then in Gravsearch, to find out if a list node is a descendant of http://rdfh.ch/lists/0001/InterestingStuff:
?listValue knora-api:listValueAsListNode ?listNode .
<http://rdfh.ch/lists/0001/InterestingStuff> knora-api:hasDescendantListNode ?listNode .
If ?listNode is http://rdfh.ch/lists/0001/VRNetworkVisualisation, the query should return a result.
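The inference described above can be simulated outside the triplestore. The following Python sketch is only an illustration of the sub-property and prp_trp rules applied to the two example triples; the `infer_descendants` function and the string-based triple representation are assumptions, not how the triplestore implements its rules.

```python
SUB = "knora-base:hasSubListNode"
DESC = "knora-base:hasDescendantListNode"

# The two explicit statements from the example above.
triples = [
    ("http://rdfh.ch/lists/0001/InterestingStuff", SUB,
     "http://rdfh.ch/lists/0001/VirtualReality"),
    ("http://rdfh.ch/lists/0001/VirtualReality", SUB,
     "http://rdfh.ch/lists/0001/VRNetworkVisualisation"),
]


def infer_descendants(statements):
    """Return all (subject, object) pairs linked by hasDescendantListNode."""
    # Sub-property rule: every hasSubListNode statement is also a
    # hasDescendantListNode statement.
    pairs = {(s, o) for (s, p, o) in statements if p in (SUB, DESC)}
    # prp_trp: from (x p y) and (y p z) derive (x p z), repeated until
    # no new pair appears.
    changed = True
    while changed:
        changed = False
        for (x, y) in list(pairs):
            for (y2, z) in list(pairs):
                if y == y2 and (x, z) not in pairs:
                    pairs.add((x, z))
                    changed = True
    return pairs


inferred = infer_descendants(triples)
```

In this sketch `inferred` holds the two explicit parent-child pairs plus the inferred pair from InterestingStuff to VRNetworkVisualisation. Note that a transitive property is not reflexive: unlike the `hasSubListNode*` path, which also matches the start node itself, `hasDescendantListNode` alone does not match InterestingStuff, so a query would probably have to test the node itself separately.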
In addition to inferring matches for the node we are looking for (whether the node itself or one of its sub-list nodes), I think an exact match would come in handy: when we want exactly that node and not one of its sub-list nodes, either because we don't want sub-list nodes or because we already know there are none.