
Wrong behaviour of SPARQL DISTINCT

Open galgonek opened this issue 4 years ago • 1 comments

While working on a study comparing different approaches to querying the neXtProt dataset, I observed that Virtuoso returns wrong results in some cases.

If I create the table —

create table nextprot.entry_bases
(
    id     varchar not null,
    primary key(id)
);

insert into nextprot.entry_bases values ('NX_Q15365');

— and define the mapping —

xml_set_ns_decl('', 'http://nextprot.org/rdf#', 2);
xml_set_ns_decl('iri', 'http://bioinfo.iocb.cz/rdf/quad-storage/linked-data-view/iri-class/nextprot#', 2);

sparql create iri class iri:entry "http://nextprot.org/rdf/entry/%U"(in id varchar not null) option (bijection).;

sparql create quad storage virtrdf:NeXtProtQuadStorage
    from DB.nextprot.entry_bases as entry_bases
{
  create virtrdf:nextprot as graph iri ("http://nextprot.org/rdf")
  {
    iri:entry(entry_bases.id)
      rdf:type :Entry.
  }
};

— then the SPARQL query —

sparql
define input:storage virtrdf:NeXtProtQuadStorage
select distinct ?entry where {
  ?entry rdf:type :Entry.
};

— returns only the fragment 'NX_Q15365' instead of the full IRI.

As a workaround, it is possible to use the following query —

sparql
define input:storage virtrdf:NeXtProtQuadStorage
select * where {{
  select distinct ?entry where {
    ?entry rdf:type :Entry.
  }
}};

— which returns http://nextprot.org/rdf/entry/NX_Q15365 as expected.
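
For reference, the expected value can be sketched outside Virtuoso. The snippet below is only an illustration: it approximates the `%U` placeholder of the `iri:entry` format string with Python's `urllib.parse.quote`, which is an assumption and not Virtuoso's own escaping routine. The helper names `expected_entry_iri` and `is_full_iri` are hypothetical.

```python
from urllib.parse import quote

# Prefix taken from the iri:entry format string "http://nextprot.org/rdf/entry/%U".
ENTRY_PREFIX = "http://nextprot.org/rdf/entry/"


def expected_entry_iri(entry_id: str) -> str:
    """Build the IRI that iri:entry is expected to produce for an id.

    %U is approximated with percent-encoding (assumption).
    """
    return ENTRY_PREFIX + quote(entry_id, safe="")


def is_full_iri(value: str) -> bool:
    """Return True if a ?entry value is the full IRI rather than the bare id."""
    return value.startswith(ENTRY_PREFIX)


print(expected_entry_iri("NX_Q15365"))
print(is_full_iri("NX_Q15365"))
```

A check like `is_full_iri` could be applied to each `?entry` binding to tell the two behaviours apart: the bare id returned by the plain DISTINCT query fails it, while the value returned by the workaround passes.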

galgonek, Jul 27 '21 12:07

Our development team will review this issue and report back as soon as possible.

openlink, Jul 27 '21 12:07