comunica
Query with multiple `COUNT` clauses returns incorrect result depending on order in projection
Issue type:
- :bug: Bug
Description:
Given the following input
@prefix ex: <http://example.org/> .
<http://foo.org/id/graph/foo> {
  <http://foo.org/id/object/foo>
    ex:foo ex:Foo ;
    ex:bar [] ;
    ex:baz "baz1", "baz2", "baz3" .
}
running this query:
SELECT (COUNT(DISTINCT ?s) AS ?subjects) (COUNT(DISTINCT ?o) as ?objects) ?p
FROM <http://foo.org/id/graph/foo>
{
?s ?p ?o .
}
GROUP BY ?p
results in
while running this query (object count switched with subject count in projection):
SELECT (COUNT(DISTINCT ?o) as ?objects) (COUNT(DISTINCT ?s) AS ?subjects) ?p
FROM <http://foo.org/id/graph/foo>
{
?s ?p ?o .
}
GROUP BY ?p
results in the expected
Environment:
software | version |
---|---|
Comunica Engine | 2.10.2 |
node | v21.6.1 |
npm | 10.2.4 |
yarn | Yarn is unavailable |
Operating System | linux (Linux 5.15.133.1-microsoft-standard-WSL2) |
NOTE: I'm running `npx comunica-sparql-file-http` on the input file above.
I've tried this on several other SPARQL implementations and these do not show this behavior.
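As an engine-independent cross-check, the grouping can be sketched in plain TypeScript. This is only an illustration of what `GROUP BY ?p` with two `COUNT(DISTINCT ...)` aggregates should compute for the data above. The triples are transcribed by hand, the blank node label `_:b0` and the helper name `countDistinctPerPredicate` are made up for this sketch, and none of it is Comunica code.

```typescript
// Triples from the named graph in the report, written out by hand.
// "_:b0" is a hypothetical label for the blank node produced by `[]`.
type Triple = { s: string; p: string; o: string };

const subject = "http://foo.org/id/object/foo";
const triples: Triple[] = [
  { s: subject, p: "http://example.org/foo", o: "http://example.org/Foo" },
  { s: subject, p: "http://example.org/bar", o: "_:b0" },
  { s: subject, p: "http://example.org/baz", o: '"baz1"' },
  { s: subject, p: "http://example.org/baz", o: '"baz2"' },
  { s: subject, p: "http://example.org/baz", o: '"baz3"' },
];

type Counts = { subjects: number; objects: number };

// GROUP BY ?p, then COUNT(DISTINCT ?s) and COUNT(DISTINCT ?o) per group.
// Each count is computed from its own set, so the order in which the two
// counts are later projected cannot influence the numbers.
function countDistinctPerPredicate(input: Triple[]): Map<string, Counts> {
  const groups = new Map<string, { s: Set<string>; o: Set<string> }>();
  for (const t of input) {
    let group = groups.get(t.p);
    if (group === undefined) {
      group = { s: new Set<string>(), o: new Set<string>() };
      groups.set(t.p, group);
    }
    group.s.add(t.s);
    group.o.add(t.o);
  }
  const result = new Map<string, Counts>();
  groups.forEach((group, p) => {
    result.set(p, { subjects: group.s.size, objects: group.o.size });
  });
  return result;
}

countDistinctPerPredicate(triples).forEach((counts, p) => {
  console.log(`${p} subjects=${counts.subjects} objects=${counts.objects}`);
});
```

Under these assumptions every predicate has one distinct subject, `ex:foo` and `ex:bar` have one distinct object each, and `ex:baz` has three, whichever count is listed first in the `SELECT`.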
Crash log:
none
Thanks for reporting!
As discussed with @rubensworks I will work on this issue.
@jitsedesmet I was told to ping you for my questions related to the expressions code.
One thing that stood out to me in the case where the counts are incorrect is that `startTerm` (code) equals the subject of the current binding, rather than the predicate which we are supposed to be counting.
My question is: what does `startTerm` mean?
Off the top of my head: the start term is basically the value you'd return when `put` was never called.
So for the count aggregator this would be 0, but the sample aggregator does not have a start term (a sample of nothing is simply an error, iirc).
I hope that helps. (I did not really read the issue, so if it didn't, just ping me again :) )
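To make the "start term" idea concrete, here is a rough sketch of two aggregators. It assumes a simplified `put`/`result` shape: the interface `AggregatorSketch` and the classes `CountDistinctSketch` and `SampleSketch` are hypothetical names for this illustration, and this is not the actual Comunica expression-evaluator code.

```typescript
// Hypothetical, simplified aggregator shape (not Comunica's real interface).
interface AggregatorSketch<T> {
  put(term: string): void; // called once per binding in the group
  result(): T;             // called once when the group is complete
}

// COUNT(DISTINCT ...): if put() is never called, the result is 0.
// That value plays the role of the start term.
class CountDistinctSketch implements AggregatorSketch<number> {
  private readonly seen = new Set<string>();

  put(term: string): void {
    this.seen.add(term);
  }

  result(): number {
    return this.seen.size;
  }
}

// SAMPLE: there is no sensible value for an empty group, so this sketch
// throws instead of returning a start term.
class SampleSketch implements AggregatorSketch<string> {
  private first: string | undefined = undefined;

  put(term: string): void {
    if (this.first === undefined) {
      this.first = term;
    }
  }

  result(): string {
    if (this.first === undefined) {
      throw new Error("SAMPLE over an empty group has no value");
    }
    return this.first;
  }
}

const emptyCount = new CountDistinctSketch();
console.log(emptyCount.result()); // 0, since put() was never called
```

In this reading a start term is fixed per aggregator and does not depend on any binding, which is why seeing it equal to a term from the current binding looks suspicious.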