sparql.update: Why (not) initBindings? Not in specs and ignored in prepareUpdate/processUpdate

Open usalu opened this issue 2 years ago • 12 comments

Here is an example of what I wanted to do with processUpdate:

from rdflib import Graph, URIRef
from rdflib.namespace import FOAF
from rdflib.plugins.sparql import processUpdate

g = Graph()
q = processUpdate(g,"INSERT DATA { ?person foaf:knows <http://www.w3.org/People/Berners-Lee> .}",
    {'person':URIRef("http://www.w3.org/People/Berners-Lee")},{ "foaf": FOAF})
g.serialize()
#'@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n\n?person foaf:knows <http://www.w3.org/People/Berners-Lee> .\n\n'

which follows, by analogy, the prepareQuery example from the intro:

import rdflib
from rdflib.namespace import FOAF
from rdflib.plugins.sparql import prepareQuery

q = prepareQuery(
    "SELECT ?s WHERE { ?person foaf:knows ?s .}",
    initNs = { "foaf": FOAF }
)

g = rdflib.Graph()
g.parse("foaf.rdf")

tim = rdflib.URIRef("http://www.w3.org/People/Berners-Lee/card#i")

for row in g.query(q, initBindings={'person': tim}):
    print(row)

Is this not supported because SPARQL 1.1 Update has no BIND in its spec? Is initBindings just a leftover, because sparql.update was developed after sparql.query, which uses the bindings in evalPart while evalUpdate doesn't?

Or is it useful for something else?

The example I showed is silly because I could just do a simple add, but my idea was to do graph rewriting/transformation on RDF by generating templates and bindings. Is there a way to do this natively, or would I be better off using a tool like Jinja?

usalu avatar Aug 25 '23 00:08 usalu

I don't really know what initBindings is for off-hand, maybe nothing useful. Someone will have to do some archaeology and hopefully update the docs.

In general, I would not rely on this though, and rather use a different template, or even python string templates (f-strings, or .format()), if possible.

In an ideal world, RDFLib's SPARQL will have bindable prepared-query support like you have for normal SQL, where you can prepare a query and then bind values to it later, but this is probably not going to be around soon.

I think we could potentially consider adding a SPARQL-quoted wrapper type, which when serialized to string serializes the contained term in a SPARQL safe manner.

So you would then have something like this:

from rdflib._contrib import SPARQLQuoted as SQ

tim = rdflib.URIRef("http://www.w3.org/People/Berners-Lee/card#i")

query = f"SELECT ?s WHERE {{ {SQ(tim)} foaf:knows ?s .}}"
assert query == "SELECT ?s WHERE { <http://www.w3.org/People/Berners-Lee/card#i> foaf:knows ?s .}"

aucampia avatar Aug 25 '23 20:08 aucampia

I just spent hours over the last week trying to figure out why graph.update(insert_query, initBindings={...}) didn't work, even wrote tests to prove it... (see #2555)

Now it seems that apparently it's not in the spec, and the code only suggested it because update was copypasta'd from query? 😭

sivy avatar Aug 29 '23 02:08 sivy

BTW @aucampia - rdflib's SPARQL does support SQL-style parameters/bindings for SELECT and I assume anything handled by Graph.query():

import rdflib
from rdflib.namespace import FOAF, RDF

joe = rdflib.URIRef('Joe')

g = rdflib.Graph()
# "a" is SPARQL/Turtle shorthand for rdf:type; in Python the term is RDF.type
g.add((joe, RDF.type, FOAF.Person))

for row in g.query("SELECT ?what WHERE { ?who a ?what }", initBindings={"who": joe}):
    print(row)

(That's from memory but I've been deep in pdb for several days tracing this stuff)

sivy avatar Aug 29 '23 02:08 sivy

Thank you for all the answers! I had this feeling about it but just wanted to make sure that there was no other way of making this work.

I guess SPARQL Update was "new" and the SPARQL implementation was already done. As SPARQL Update is only an extension of SPARQL 1.1 and not (yet?) part of something like SPARQL 2.1, as a developer you have two choices:

You either raise the level of abstraction and reimplement something that already works (long live unit tests), or you just say: "Ah, let's use the SPARQL implementation as a template and then adjust it."

Further, if BIND is not even part of the spec, then by implementing a feature like that (useful as it seems), you potentially lock people in, which is what the standard is trying to fight in the first place. Standard(s) problem.

I guess there won't be a significant number of new SPARQLs coming, so future duplication is not that extreme. It doesn't feel entirely satisfying, but it is definitely reasonable. After all, it works great!

usalu avatar Aug 30 '23 10:08 usalu

"Is this not supported because Sparql 1.1 Update has no BIND inside its specs"

@usalu I don't think this is correct. BIND is just defined as part of the Query spec.

I think what you want for your use case is VALUES.

namedgraph avatar Aug 30 '23 11:08 namedgraph

@namedgraph Oh really, I can use BIND in SPARQL Update? That would be good news because it would spare me some SPARQLRule stuff. When I was trying to find BIND in the SPARQL 1.1 Update spec, I could only find semantics for [screenshot], which gives an informative EBNF-like snippet [screenshot]. But when I actually tried to find a valid grammar for SPARQL Update, it only refers to appendix B [screenshot], which is something totally different [screenshot]. I guess they actually meant appendix C [screenshot], which itself refers to SPARQL Query, but there is none of the SPARQL Update stuff there [screenshot].

Where did I miss the integration?

Yeah, VALUES is a great idea, but I also thought it wasn't part of SPARQL Update (same as BIND).

My original attempt was:

from rdflib import Graph
g = Graph()
g.update(
"""
PREFIX ns: <http://example.org/ns#>
INSERT DATA 
{   ?b ns:number_of_floors ?nof ; ns:floor_height ?fh .
    BIND (?nof*?fh AS ?height)
    ?b ns:height ?height .
    VALUES (?b ?nof ?fh){
        (ns:b1 5 3)
        (ns:b2 10 4)
    }
} """)  # Yields some error

My second attempt was trying to get rid of VALUES through initBindings:

from rdflib import Graph, Literal, Namespace
from rdflib.plugins.sparql import prepareUpdate

NS = Namespace("http://example.org/ns#")
g = Graph()
q = prepareUpdate(
"""
INSERT DATA 
{   ?b ns:number_of_floors ?nof ; ns:floor_height ?fh .
    BIND (?nof*?fh AS ?height)
    ?b ns:height ?height .
}""",
    initNs = { "ns": NS}
)
for b in [{"id":"b1","nof":5,"fh":3},
          {"id":"b2","nof":10,"fh":4}]:
    # Look the values up in the current dict b, not as attributes of NS
    g.update(q, initBindings={
        "b": NS[b["id"]],
        "nof": Literal(b["nof"]),
        "fh": Literal(b["fh"])}) # Yields some error
g.serialize()

And my third attempt was to get rid of BIND with SPARQLRules:

ns:BuildingShape
	a sh:NodeShape ;
	sh:targetClass ns:Building ;
	sh:property [
		sh:path ns:number_of_floors ;
		sh:datatype xsd:integer ;
		sh:minCount 1 ;
		sh:maxCount 1 ;
	] ;
	sh:property [
		sh:path ns:floor_height ;
		sh:datatype xsd:float ;
		sh:minCount 1 ;
		sh:maxCount 1 ;
	] .
    
ns:BuildingRulesShape
	a sh:NodeShape ;
	sh:targetClass ns:Building ;
	sh:rule [
		a sh:SPARQLRule ;
		sh:prefixes ns: ;
		sh:construct """
			CONSTRUCT {
				$this ns:height ?height .
			}
			WHERE {
				$this ns:number_of_floors ?nof .
				$this ns:floor_height ?fh .
				BIND (?nof * ?fh AS ?height) .
			}
			""" ;
		sh:condition ns:BuildingShape ; 
	] ;
.

But although pyshacl supports inference, there is no built-in way, short of hacking, to access the inferred triples, because it is mainly a validation engine and not an inference system.

Can you provide me a SPARQL Update example that uses BIND or VALUES?

Please let me know if I am missing or misunderstanding something!

usalu avatar Aug 30 '23 18:08 usalu

@usalu you need a WHERE clause to be able to use BIND, so the INSERT DATA form will not work. In general INSERT DATA is for concrete triples without variables, AFAIK.

Try the INSERT ... WHERE form, for example:

PREFIX  ns:   <http://example.org/ns#>

INSERT {
  ?b ns:height ?height .
}
WHERE
  { ?b  ns:number_of_floors  ?nof ;
        ns:floor_height      ?fh
    BIND(( ?nof * ?fh ) AS ?height)
    VALUES ( ?b ?nof ?fh ) {
      ( ns:b1 5 3 )
      ( ns:b2 10 4 )
    }
  }

namedgraph avatar Aug 30 '23 18:08 namedgraph

@namedgraph Wow, this is a form I hadn’t seen yet, and will come in handy!!

sivy avatar Aug 30 '23 18:08 sivy

They're all in the specs ;)

P.S. check out https://kgdev.net, we tried to collect them in one place and structure them

namedgraph avatar Aug 30 '23 18:08 namedgraph

Oh I'm sure they are, and I've been reading the specs, but as good as the specs are, they could use more examples of how the various things combine IMO 😁 - thanks for the info!

sivy avatar Aug 30 '23 18:08 sivy

@namedgraph Thank you so much! Ah, because of the examples for BIND [screenshot] and VALUES [screenshot], I thought they would work inside INSERT DATA too.

from rdflib import Graph

g = Graph()
g.update(
"""
PREFIX  ns:   <http://example.org/ns#>

INSERT {
  ?b ns:height ?height .
}
WHERE
  { ?b  ns:number_of_floors  ?nof ;
        ns:floor_height      ?fh
    BIND(( ?nof * ?fh ) AS ?height)
    VALUES ( ?b ?nof ?fh ) {
      ( ns:b1 5 3 )
      ( ns:b2 10 4 )
    }
}""")
g.serialize() #Empty graph

This actually conforms, for the first time, to rdflib's SPARQL algebra 🎉

I still get an empty graph tho ._.

usalu avatar Aug 30 '23 18:08 usalu

@usalu try making a SELECT query with this WHERE first and see if you get results. No results means there would be no bindings for the update.

namedgraph avatar Aug 30 '23 19:08 namedgraph