
Invalid queries if a variable name is a substring of another variable name

Open jaw111 opened this issue 3 years ago • 2 comments

Given a query like the following, where one variable name is a substring of another variable name:

select *
where {
  [] rdfs:label ?__label ;
    skos:prefLabel ?__label2 .
}

If the request URL includes the label parameter, e.g. ?label=foo, then the resulting query is invalid, because every occurrence of the string ?__label is replaced by "foo", including the one inside ?__label2:

select *
where {
  [] rdfs:label "foo" ;
    skos:prefLabel "foo"2 .
}

The logic for rewriting the queries should be more robust than simply replacing strings in the query text to account for this.
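
The failure is easy to reproduce outside grlc with plain substring replacement. The snippet below is only a minimal sketch of that behaviour, not grlc's actual rewriting code; `naive_rewrite` is a hypothetical helper introduced for illustration.

```python
QUERY = """select *
where {
  [] rdfs:label ?__label ;
    skos:prefLabel ?__label2 .
}"""


def naive_rewrite(query, bindings):
    """Hypothetical helper: substitute each variable by plain substring replacement."""
    for var, value in bindings.items():
        # str.replace has no notion of token boundaries, so "?__label"
        # also matches the first eight characters of "?__label2".
        query = query.replace(var, value)
    return query


broken = naive_rewrite(QUERY, {"?__label": '"foo"'})
print(broken)
```

The printed query contains the malformed term `"foo"2`, which a SPARQL parser rejects.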

jaw111, Aug 31 '21 15:08

Hi @jaw111! Interesting issue, I don't think we've ever come across this sort of use case before. We've been thinking for a while that the variable replacement code should be upgraded, to overcome issues such as #230, so maybe this is something to be taken into account as well.

The only thing I can think of is to do the string replacement starting from the longest variable name (?__label2 in your example above). So something along these lines should do the trick:

def doReplace(s, vals):
    # Start replacing longest variable names
    for key in sorted(vals.keys(), key=len, reverse=True):
        s = s.replace(key, vals[key])
    return s

This is not very sophisticated, but if you are aware of any more elegant algorithm to address this issue, we are open to suggestions :-)
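
One caveat with longest-first ordering: it only helps when the longer variable is itself among the values being replaced. If the request supplies only label, ?__label2 is not a key and still gets corrupted. A possible refinement, sketched below under stated assumptions, is to match whole variable tokens with a regular expression. `replace_vars` is a hypothetical name, and the character class is an ASCII-only approximation of the characters allowed in SPARQL variable names.

```python
import re


def replace_vars(query, vals):
    """Replace whole variable tokens only (hypothetical helper, not grlc code)."""
    if not vals:
        return query
    # Longest names first so the alternation prefers the longest match.
    names = sorted(vals, key=len, reverse=True)
    # (?![A-Za-z0-9_]) rejects a match that is only a prefix of a longer
    # variable name. ASCII-only approximation of SPARQL variable names.
    pattern = re.compile(
        "(?:" + "|".join(re.escape(n) for n in names) + ")(?![A-Za-z0-9_])"
    )
    # A function replacement avoids backslash processing in the values.
    return pattern.sub(lambda m: vals[m.group(0)], query)


query = "[] rdfs:label ?__label ; skos:prefLabel ?__label2 ."
result = replace_vars(query, {"?__label": '"foo"'})
print(result)
```

Like any text-level approach, this sketch would still rewrite a variable name that happens to appear inside a string literal or a comment.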

c-martinez, Sep 21 '21 20:09

@c-martinez I like your suggestion, it's nice and simple :)

My other thought was that, as the query is being translated into the SPARQL Algebra Expression with rdflib, it should be possible to programmatically manipulate that expression to replace the variable by the RDF term and reserialize back to text. That might be more complex than manipulating the query text as a string, but should be a more robust approach.

Another approach would be to construct a VALUES clause with the bindings for the relevant variables and simply append that to the query text as suggested in #332.
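
A rough sketch of the VALUES idea, assuming every parameter is passed as a plain string literal and the query is a SELECT that can take a trailing VALUES clause. Both `sparql_literal` and `append_values` are hypothetical helpers, not part of grlc or rdflib.

```python
def sparql_literal(value):
    """Quote a Python string as a plain SPARQL string literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def append_values(query, params):
    """Append a trailing VALUES clause binding each supplied variable.

    Hypothetical helper; `params` maps variable names (including the
    leading "?") to raw string values.
    """
    if not params:
        return query
    names = " ".join(params)
    row = " ".join(sparql_literal(v) for v in params.values())
    return f"{query.rstrip()}\nVALUES ({names}) {{ ({row}) }}"


query = "select * where { [] rdfs:label ?__label ; skos:prefLabel ?__label2 . }"
rewritten = append_values(query, {"?__label": "foo"})
print(rewritten)
```

Because the query text itself is left untouched, ?__label2 stays an ordinary variable and the substring problem does not arise. Typed literals, language tags, and IRIs would need their own serialisation.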

jaw111, Sep 29 '21 07:09