
Poor performance of nested OPTIONALs

Open jindrichmynarz opened this issue 6 years ago • 1 comments

A SPARQL query with nested OPTIONAL clauses, such as the following, performs poorly and typically times out.

PREFIX bibo:    <http://purl.org/ontology/bibo/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT *
WHERE {
  {
    SELECT ?article
    WHERE {
      ?article a bibo:Article .
    }
    LIMIT 10
  }

  OPTIONAL {
    OPTIONAL {
      ?article dcterms:issued ?article_issued .
    }
  }
}

Output of Halyard Profile for this query:

Optimized query:
    Projection [2,955,991,897,878,706.5]
        ProjectionElemList
            ProjectionElem "article"
            ProjectionElem "article_issued"
        LeftJoin [2,955,991,897,878,706.5]
            Slice ( limit=10 ) [3,614,563.841]
                Projection [3,614,563.841]
                    ProjectionElemList
                        ProjectionElem "article"
                    StatementPattern [3,614,563.841]
                        Var (name=article)
                        Var (name=_const_f5e5585a_uri, value=http://www.w3.org/1999/02/22-rdf-syntax-ns#type, anonymous)
                        Var (name=_const_6dd7acd3_uri, value=http://purl.org/ontology/bibo/Article, anonymous)
            LeftJoin [226.251]
                SingletonSet [1]
                StatementPattern [226.251]
                    Var (name=article)
                    Var (name=_const_884f353b_uri, value=http://purl.org/dc/terms/issued, anonymous)
                    Var (name=article_issued)

The nested OPTIONAL in this query is unnecessary, but it allows the issue to be replicated in a minimal way.
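
For comparison, here is a sketch of the same query with the redundant outer OPTIONAL removed. Under standard SPARQL semantics it should return the same solutions, since an OPTIONAL that wraps only another OPTIONAL adds no bindings of its own. Whether this form avoids the timeout is not established by this report; it is included only as a baseline to profile against.

```sparql
PREFIX bibo:    <http://purl.org/ontology/bibo/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT *
WHERE {
  {
    # Same sub-select as in the failing query.
    SELECT ?article
    WHERE {
      ?article a bibo:Article .
    }
    LIMIT 10
  }

  # Single OPTIONAL instead of OPTIONAL { OPTIONAL { ... } }.
  OPTIONAL {
    ?article dcterms:issued ?article_issued .
  }
}
```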

jindrichmynarz, Sep 05 '19 13:09

Mapping a nested OPTIONAL to a LeftJoin with a SingletonSet is correct and should not cause any issue. I see a minor issue with the cardinality of the sub-select with Slice; however, it does not affect the final query tree. I'm aware of some specific queries that cause performance issues, but unfortunately the cause is not as simple as a nested OPTIONAL. It requires further investigation.

asotona, Sep 09 '19 20:09