problem with graphs
Hi, after I loaded some RDF data into a new database and started the Virtuoso server, the number of graphs in the database increased unexpectedly.
In the initial state, I got the following graphs from the SPARQL endpoint at http://data.allie.dbcls.jp/sparql .
"http://purl.org/allie" 163457277
"http://data.allie.dbcls.jp/DAV/" 4667
"http://www.openlinksw.com/schemas/virtrdf#" 2475
"http://www.openlinksw.com/schemas/oplweb#" 1791
"virtrdf-label" 644
"http://purl.org/allie/ontology/201108#" 173
"http://www.w3.org/2002/07/owl#" 160
"http://purl.org/allie/void" 44
"http://data.allie.dbcls.jp/sparql" 33
"facets" 32
"b3sifp" 6
"b3sonto" 5
"http://www.w3.org/ns/ldp#" 3
"urn:rules.skos" 2
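For anyone who wants to take the same kind of snapshot, here is a minimal Python sketch. The query text is an assumption (the post does not show the query that was actually used), and `fetch_graph_counts` and `parse_graph_counts` are hypothetical helper names. The `format` request parameter is the one the Virtuoso /sparql endpoint accepts; other endpoints may need an `Accept` header instead.

```python
import json
import urllib.parse
import urllib.request

# Assumed query: one row per named graph with its triple count.
# The original post does not show the query that produced the listing.
GRAPH_COUNT_QUERY = """
SELECT ?g (COUNT(*) AS ?triples)
WHERE { GRAPH ?g { ?s ?p ?o } }
GROUP BY ?g
ORDER BY DESC(?triples)
"""


def parse_graph_counts(results):
    """Turn a SPARQL 1.1 JSON results document into {graph name: count}."""
    return {
        row["g"]["value"]: int(row["triples"]["value"])
        for row in results["results"]["bindings"]
    }


def fetch_graph_counts(endpoint):
    """Run the query over HTTP GET and return {graph name: count}."""
    params = urllib.parse.urlencode({
        "query": GRAPH_COUNT_QUERY,
        "format": "application/sparql-results+json",
    })
    with urllib.request.urlopen(endpoint + "?" + params) as response:
        return parse_graph_counts(json.load(response))
```

Calling `fetch_graph_counts("http://data.allie.dbcls.jp/sparql")` would issue the query; on a store of this size the count may take a while or hit the endpoint's time limit.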
One or two weeks later, I got the following graphs, which included four unintended graphs.
"http://purl.org/allie" 163457277
"http://data.allie.dbcls.jp/DAV/" 4667
"http://www.openlinksw.com/schemas/virtrdf#" 2475
"http://www.openlinksw.com/schemas/oplweb#" 1791
"virtrdf-label" 644
"http://purl.org/allie/ontology/201108#" 173
"http://www.w3.org/2002/07/owl#" 160
"http://purl.org/allie/void" 44
"http://data.allie.dbcls.jp/sparql" 33
"facets" 32
"http://purl.org/allie/id/pair/2913807" 11 <---- unintended graph
"b3sifp" 6
"http://purl.org/allie/id/longform/1443695" 6 <---- unintended graph
"b3sonto" 5
"http://purl.org/allie/id/longform/1430142" 4 <---- unintended graph
"http://purl.org/allie/id/longform/2232032" 4 <---- unintended graph
"http://www.w3.org/ns/ldp#" 3
"urn:rules.skos" 2
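Comparing two such listings by eye is error-prone, so here is a small Python sketch that diffs two snapshots. The function name `new_graphs` is hypothetical, and the dictionaries hold only a subset of the graphs listed above, for brevity.

```python
def new_graphs(before, after):
    """Return (graph, count) pairs present in `after` but not in `before`,
    largest count first."""
    added = {g: n for g, n in after.items() if g not in before}
    return sorted(added.items(), key=lambda item: item[1], reverse=True)


# Subset of the two listings from this report.
before = {
    "http://purl.org/allie": 163457277,
    "facets": 32,
    "b3sifp": 6,
    "b3sonto": 5,
}
after = dict(before)
after.update({
    "http://purl.org/allie/id/pair/2913807": 11,
    "http://purl.org/allie/id/longform/1443695": 6,
    "http://purl.org/allie/id/longform/1430142": 4,
    "http://purl.org/allie/id/longform/2232032": 4,
})

for graph, count in new_graphs(before, after):
    print(graph, count)
```

Running this periodically against fresh snapshots would show when the extra graphs appear, which may help correlate them with specific client requests.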
Why does this kind of thing happen?
On 19 July 2017 at 04:13, Tofyoumi Fujiwara [email protected] wrote:
After one or two weeks later, I got the following graphs which included unintended three graphs.
"http://purl.org/allie" 163457277
"http://data.allie.dbcls.jp/DAV/" 4667
"http://www.openlinksw.com/schemas/virtrdf#" 2475
"http://www.openlinksw.com/schemas/oplweb#" 1791
"virtrdf-label" 644
"http://purl.org/allie/ontology/201108#" 173
"http://www.w3.org/2002/07/owl#" 160
"http://purl.org/allie/void" 44
"http://data.allie.dbcls.jp/sparql" 33
"facets" 32
"http://purl.org/allie/id/pair/2913807" 11
"b3sifp" 6
"http://purl.org/allie/id/longform/1443695" 6 <---- unintended graph
"b3sonto" 5
"http://purl.org/allie/id/longform/1430142" 4 <---- unintended graph
"http://purl.org/allie/id/longform/2232032" 4 <---- unintended graph
"http://www.w3.org/ns/ldp#" 3
"urn:rules.skos" 2
Why does this kind of thing happen?
Off the top of my head, if you've been using the /sparql endpoint and made a query with an option other than "use only local data" selected, that could slurp other resources. There are similar pragmas for use within a single SPARQL query that achieve the same effect - see mentions of "grab" in http://docs.openlinksw.com/virtuoso/rdfsparqlimplementatioptragmas/ . If you also have the Sponger (RDF Cartridges) installed, then the fetched resources could quite easily become new graphs.
There may also be other ways.
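One way to check whether incoming queries carry such pragmas is to scan the query text, for example from the HTTP access log. The Python sketch below is only an illustration: the pattern covers the `input:grab-*` and `get:*` pragma families, which is an assumption about what matters here and is not an exhaustive list, and `sponging_pragmas` is a hypothetical helper name.

```python
import re

# Assumed pattern: "define input:grab-..." and "define get:..." pragmas.
# This is not a complete list of pragmas that can cause remote fetching.
SPONGE_PRAGMA = re.compile(
    r"\bdefine\s+(input:grab-[\w-]+|get:[\w-]+)", re.IGNORECASE
)


def sponging_pragmas(query):
    """Return the names of fetch-related pragmas found in a SPARQL query."""
    return [match.group(1) for match in SPONGE_PRAGMA.finditer(query)]


query = 'define input:grab-all "yes" define input:grab-depth 2 SELECT * WHERE { ?s ?p ?o }'
print(sponging_pragmas(query))
```

A query without any of these pragmas yields an empty list, so a non-empty result is the thing to look for in the logs.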
~Tim
Tim Haynes Product Development Consultant OpenLink Software http://www.openlinksw.com/ http://twitter.com/openlink
@openlink Hello, and thanks for your help. I have always run my queries with "use only local data" selected. Furthermore, I have not installed the Sponger (RDF Cartridges).
It seems I am facing the same bug in production. Were the graph IRIs valid IRIs of instances in your data?
@fujitoyo, @serasset —
Given that this issue was opened in July 2017, it probably makes sense to close it and start fresh, if the odd graph creation can be recreated with current Virtuoso and VADs. We would need fairly explicit steps to recreate the issue ourselves and/or links to your instance(s), if network accessible.
It was indeed present using the latest docker image, and I created a git repository to demonstrate it, as the data needed to be big enough for this fairly random bug to appear. In doing so, I think I understood the problem:
- In the faulty database, the SPARQL user had been granted the SPONGE role.
- From time to time, a SPARQL DESCRIBE received from a client apparently fails on data that is otherwise available on the server.
- Then, because it is allowed to sponge, the server seems to try to fetch information from the queried URI. (The server does not realize that it is the authoritative server for the URI, because it runs as localhost:8990 behind a ProxyPass on the public web server.)
- So the server asks itself for information about the queried URI, succeeds, and fetches the queried information from itself.
- This information is sponged and, because it is treated as external, it seems to be added to the database in a graph named after the queried URI.
So the symptom was that graphs bearing the name of existing nodes were created out of the blue.
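To confirm that an unexpected graph is named after an existing node, one can ask the endpoint whether the graph IRI also occurs as a subject in the main data graph. The Python sketch below only builds the ASK query string. `subject_check_query` is a hypothetical helper, and using `http://purl.org/allie` as the data graph is an assumption taken from the listing in the original report.

```python
def subject_check_query(graph_iri, data_graph):
    """Build an ASK query: does `graph_iri` occur as a subject in `data_graph`?"""
    for iri in (graph_iri, data_graph):
        # Reject characters that are not allowed inside <...> in SPARQL.
        if any(ch in iri for ch in '<>"{}|^`\\ '):
            raise ValueError(f"not a safe IRI: {iri!r}")
    return f"ASK {{ GRAPH <{data_graph}> {{ <{graph_iri}> ?p ?o }} }}"


print(subject_check_query(
    "http://purl.org/allie/id/pair/2913807",  # one of the unexpected graphs
    "http://purl.org/allie",                  # assumed main data graph
))
```

If the endpoint answers true for each unexpected graph, that matches the symptom described here: the new graphs carry the IRIs of nodes that already exist in the data.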
If somebody wants to further debug this, I can give the docker configuration that reproduces the problem (here or on another issue).
Removing the SPONGE role from the SPARQL user fixed the issue.