sparql.anything Is it possible to generate BlankNodes from data references?

The behavior should be similar to the one in RML:

@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ql: <http://semweb.mmlab.be/ns/ql#> .
@prefix ex: <http://example/> .
@prefix : <http://example.org/> .
@base <http://example.org/> .

:firstTM a rr:TriplesMap ;
    rml:logicalSource [
        rml:source "data.csv";
        rml:referenceFormulation ql:CSV
    ];
    rml:subjectMap [
        rml:reference "c1" ;
        rr:termType rr:BlankNode
    ];
    rr:predicateObjectMap [
        rr:predicate ex:p ;
        rml:objectMap [
            rr:template "http://example/{c2}"
        ]
    ] .

Input:

c1,c2
b0,A

Output:

 _:b0 ex:p ex:A

Jul 13 '22 12:07 dachafra

You can just construct bnodes:

PREFIX ex: <http://example/> 
PREFIX fx:  <http://sparql.xyz/facade-x/ns/>
PREFIX xyz: <http://sparql.xyz/facade-x/data/>

CONSTRUCT {
  [] ex:p ?A
} WHERE {
  SERVICE <x-sparql-anything:> {
    fx:properties fx:location "./data.csv" ; fx:csv.headers true .
    [] xyz:c2 ?A
  }
}

or, if you want to control the bnode identifier for some reason:

PREFIX ex: <http://example/> 
PREFIX fx:  <http://sparql.xyz/facade-x/ns/>
PREFIX xyz: <http://sparql.xyz/facade-x/data/>

CONSTRUCT {
  ?bnode ex:p ?A
} WHERE {
  SERVICE <x-sparql-anything:> {
    fx:properties fx:location "./data.csv" ; fx:csv.headers true .
    [] xyz:c1 ?b0 ; xyz:c2 ?A
  }
  BIND ( BNODE ( ?b0 ) AS ?bnode )
}

Jul 13 '22 15:07 enridaga

I've arrived at this point, yes, but you cannot take the identifier of the blank node from the input source, right?

Jul 14 '22 13:07 dachafra

I've arrived at this point, yes, but you cannot take the identifier of the blank node from the input source, right?

You can take it from there, as you can see in the second query. I am not sure I understand the use case, though. Do you mean that you want to keep the blank node identifier in the generated graph? The generated blank node IDs depend on the serialiser. Blank node identifiers are supposed to be local and are usually generated during serialisation or data loading, so what's the point of forcing them? If you want to mint an identifier, you probably want an IRI instead. Am I getting it right?
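
As a rough sketch of the IRI alternative, assuming the same data.csv as above: the query below mints the subject from column c1 instead of calling BNODE. The namespace http://example/id/ is a made-up placeholder for illustration, and ENCODE_FOR_URI is only there to keep cell values safe inside an IRI.

```sparql
PREFIX ex:  <http://example/>
PREFIX fx:  <http://sparql.xyz/facade-x/ns/>
PREFIX xyz: <http://sparql.xyz/facade-x/data/>

CONSTRUCT {
  ?subject ex:p ?object
} WHERE {
  SERVICE <x-sparql-anything:> {
    fx:properties fx:location "./data.csv" ; fx:csv.headers true .
    [] xyz:c1 ?c1 ; xyz:c2 ?c2
  }
  # Assumption: http://example/id/ is a hypothetical namespace for minted subjects.
  BIND ( IRI(CONCAT("http://example/id/", ENCODE_FOR_URI(?c1))) AS ?subject )
  # Mirrors the rr:template "http://example/{c2}" object map from the RML mapping.
  BIND ( IRI(CONCAT(STR(ex:), ENCODE_FOR_URI(?c2))) AS ?object )
}
```

Unlike a blank node label, an IRI minted this way stays the same across runs and serialisers, so rows sharing a c1 value end up on the same subject.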

Jul 14 '22 16:07 enridaga

you could do this:

curl --silent 'http://localhost:3000/sparql.anything'  \
--header "Accept: text/csv" \
--data-urlencode 'query=
PREFIX  fx:   <http://sparql.xyz/facade-x/ns/>
SELECT  *
WHERE
  { SERVICE <x-sparql-anything:>
      { fx:properties
                  fx:location     "/app/input.csv" ;
                  fx:csv.headers  true .
        ?s        ?p              ?o
        BIND(iri(?s) AS ?s_iri)
      }
  }
'

yielding:

s	p	o	s_iri
_:b0	http://sparql.xyz/facade-x/data/c1	b0	_:file:/app/input.csv##row1
_:b0	http://sparql.xyz/facade-x/data/c2	A	_:file:/app/input.csv##row1
_:b1	http://www.w3.org/1999/02/22-rdf-syntax-ns#type	http://sparql.xyz/facade-x/ns/root	_:file:/app/input.csv#
_:b1	http://www.w3.org/1999/02/22-rdf-syntax-ns#_1	_:b0	_:file:/app/input.csv#

Jul 14 '22 23:07 justin2004

oh, i know what you want now. one minute.

Jul 14 '22 23:07 justin2004

it appears that Apache Jena does not let you synthesize a bnode identifier manually. this is as close as i can get, but neither quad is what you are looking for (one isn't a well-formed quad and i'm not sure about the other). in practice i think i would use an actual IRI.

curl --silent 'http://localhost:3000/sparql.anything'  \
--header "Accept: application/n-quads" \
--data-urlencode 'query=
PREFIX  :     <http://example.com/>
PREFIX  xyz:  <http://sparql.xyz/facade-x/data/>
PREFIX  fx:   <http://sparql.xyz/facade-x/ns/>
CONSTRUCT 
  { 
    ?new_s_iri :p ?new_c2 .
    ?new_s_str :p ?new_c2 .
  }
WHERE
  { SERVICE <x-sparql-anything:>
      { fx:properties
                  fx:location     "/app/input.csv" ;
                  fx:csv.headers  true .
        ?s        xyz:c1          ?c1 ;
                  xyz:c2          ?c2
        BIND(iri(concat("_:", ?c1)) AS ?new_s_iri)
        BIND(concat("_:", ?c1) AS ?new_s_str)
        BIND(iri(concat(str(:), ?c2)) AS ?new_c2)
      }
  }
'

yields:

"_:b0" <http://example.com/p> <http://example.com/A> .
<_:b0> <http://example.com/p> <http://example.com/A> .

Jul 14 '22 23:07 justin2004

@justin2004 yeah, exactly! I was able to obtain the same results, but I don't think that any of the results are valid RDF, right?

For context, this comes from one of the R2RML test cases: https://www.w3.org/2001/sw/rdb2rdf/test-cases/#R2RMLTC0002b. It is not that I specifically want this feature in the engine; it is more for comparing the two solutions. One of the main benefits of this feature is that identifiers do not have to be kept in memory during execution.

Jul 15 '22 08:07 dachafra

I don't think it is possible to control the blank nodes that are generated by the serializer, but this is probably a question for [email protected].

However, while playing with this use case I found an interesting issue that arises when one wants to generate multiple triples with the same bnode across different CONSTRUCT template projections. At the moment, a new bnode is generated for every projection, even if we use the BNODE function. This is reproducible by adding more rows to the example CSV: a new bnode is created for each one of them. I will open a separate issue for that.
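
The scoping of BNODE with an argument can be probed without any CSV input. This is a minimal sketch, assuming the engine follows the SPARQL 1.1 definition of BNODE(str), under which the same string maps to the same blank node only within a single solution. The variable names are illustrative.

```sparql
SELECT ?row ?a ?b ?same
WHERE {
  # Two solutions that carry the same label string.
  VALUES (?row ?id) { (1 "b0") (2 "b0") }
  # Two calls with the same argument inside one solution.
  BIND ( BNODE(?id) AS ?a )
  BIND ( BNODE(?id) AS ?b )
  # Checks whether both calls returned the same blank node in this solution.
  BIND ( sameTerm(?a, ?b) AS ?same )
}
```

Under that definition, ?same should be true in each row, while ?a should differ between row 1 and row 2 even though both rows pass "b0".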

Jul 15 '22 08:07 enridaga

At the moment, a new bnode is generated for every projection, even if we use the BNODE function.

I thought I just wasn't understanding how to use bnode() with an argument, but since you might also have expected different behavior, I opened an issue: https://issues.apache.org/jira/browse/JENA-2340

Jul 15 '22 12:07 justin2004

For context, this comes from one of the R2RML test cases: https://www.w3.org/2001/sw/rdb2rdf/test-cases/#R2RMLTC0002b. It is not that I specifically want this feature in the engine; it is more for comparing the two solutions.

Considering they are bnodes, the comparison can be done via graph isomorphism (there are some useful utils for this in Jena).

Jul 18 '22 08:07 enridaga
