sparql.anything String.split() and iterations in general

The use case is that I am dealing with a CSV whose values themselves are sometimes comma-separated strings, e.g.

"Fotomontaggio","Biennale di Venezia 1978","Carta, Carta da lucido, Inchiostro , Grafite, Matita colorata"

Ideally, I would like to call a wrapper around String.split() on the last value and generate a URI for each material. However, we know SPARQL is not well suited to loops and iterators. At the same time, this is likely to be a frequent need for anyone who wants to use SPARQL-Anything's CONSTRUCT to re-engineer data into RDF.
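
To pin down the intended semantics, here is a minimal sketch in Python rather than SPARQL. It only illustrates the desired result (split the cell, trim each value, mint one URI per material) and says nothing about how SPARQL Anything would implement it. The base namespace and the function name material_uris are hypothetical.

```python
from urllib.parse import quote

# Hypothetical base namespace, chosen only for illustration.
BASE = "http://example.org/material/"


def material_uris(cell, sep=","):
    """Split a multi-valued cell and mint one URI per non-empty value."""
    uris = []
    for part in cell.split(sep):
        label = part.strip()  # drop whitespace around each value
        if label:  # skip empty fragments
            # Percent-encode the label so it is safe inside a URI.
            uris.append(BASE + quote(label, safe=""))
    return uris


cell = "Carta, Carta da lucido, Inchiostro , Grafite, Matita colorata"
for uri in material_uris(cell):
    print(uri)
```

The point of the sketch is that one input cell yields several output values, which is exactly the one-to-many step that is awkward to express with ordinary single-valued SPARQL functions.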

One can think of generating a structure similar to a VALUES table, but at execution time and with some clever planning to ensure they are evaluated before the respective bindings are generated.

Would this be worth discussing?

Jan 12 '22 15:01 alexdma

That's a great proposal! Indeed, support for functions that return a single output is easy in SPARQL. In the past, I have used the Jena magic property apf:strSplit [1], but the approach is very cumbersome. Essentially, the variable that is the subject of the triple is bound several times, once for each element of the array, and it is impossible to know the order.

Actually, it is not impossible, but it is very cumbersome; see this, for example:

    {
        ?val apf:strSplit ( ?match3 "/" ) .
        BIND ( fx:serial("title") as ?check ) .
        FILTER ( ?check = 1 ) .
        BIND ( ?val as ?title ) .
    } UNION {
        ?val apf:strSplit ( ?match3 "/" ) .
        BIND ( fx:serial("year") as ?check ) .
        FILTER ( ?check = 2 ) .
        BIND ( ?val as ?year ) .
    }

The above snippet calls the function many times, but each block catches only the Nth element (fx:serial is conveniently used to check for the correct binding). It is really ugly, also because there is no absolute certainty that the ?val bindings will come in the right order (partial query solutions are not meant to be sorted, as far as I know).
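
What the two UNION branches are emulating is positional access to the pieces of a split string. A minimal Python sketch of that intent, outside SPARQL, may make the goal clearer; the function name split_fields and the sample input standing in for ?match3 are hypothetical.

```python
def split_fields(value, sep, names):
    """Split `value` and pair each piece with a field name by position."""
    parts = value.split(sep)
    if len(parts) != len(names):
        # Fail loudly instead of silently mis-assigning fields.
        raise ValueError(f"expected {len(names)} fields, got {len(parts)}")
    return dict(zip(names, parts))


# Hypothetical input standing in for ?match3.
record = split_fields("Fotomontaggio/1978", "/", ["title", "year"])
print(record["title"], record["year"])
```

In a general-purpose language the order of the pieces is guaranteed by the list, which is precisely the guarantee missing from the stream of ?val bindings.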

In summary, this is definitely worth exploring :)

[1] https://jena.apache.org/documentation/query/library-propfunc.html

Jan 13 '22 12:01 enridaga

One can think of generating a structure similar to a VALUES table, but at execution time and with some clever planning to ensure they are evaluated before the respective bindings are generated.

This is intriguing. Do you think this is possible without changing the SPARQL syntax? What do you think this would look like from the user's point of view?

Jan 13 '22 12:01 enridaga

Hi, apologies for replying so late. I think that the solution proposed by @enridaga in #329 also solves this. Another possible solution would be to use the StringTriplifier.

For example, the following query

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX fx: <http://sparql.xyz/facade-x/ns/>
PREFIX xyz: <http://sparql.xyz/facade-x/data/>
PREFIX ex: <http://example.org/>
SELECT ?material
WHERE {
  SERVICE<x-sparql-anything:> {
    fx:properties fx:content "\"Fotomontaggio\",\"Biennale di Venezia 1978\",\"Carta, Carta da lucido, Inchiostro , Grafite, Matita colorata\"" .
    fx:properties fx:media-type "text/csv" .
    ?s fx:anySlot ?cell .

    SERVICE<x-sparql-anything:> {
        fx:properties fx:content ?cell .
        fx:properties fx:media-type "text/plain" .
        fx:properties fx:txt.split "," .
        ?s1 fx:anySlot ?material
    }

  }
}

gives

------------------------------
| material                   |
==============================
|                            |
| "Fotomontaggio"            |
| "Biennale di Venezia 1978" |
| "Carta"                    |
| " Carta da lucido"         |
| " Inchiostro "             |
| " Grafite"                 |
| " Matita colorata"         |
------------------------------

Feb 09 '23 13:02 luigi-asprino

Maybe it would be better to close the issue and convert it to a discussion.

Feb 09 '23 13:02 luigi-asprino
