Relational database

Open enridaga opened this issue 4 years ago • 8 comments

Support relational databases, for example, by developing a connector using JDBC.

enridaga avatar Dec 18 '20 16:12 enridaga

Developing R2RML mappings to Facade-X should, in principle, allow us to build a component that rewrites queries rather than transforming the whole table before query execution (as we currently do with CSVs ...)

enridaga avatar Jan 16 '21 16:01 enridaga

However, D2RQ does not seem to be active (the GitHub project is archived), and neither do the other R2RML Java projects mentioned here. I wonder where to find a robust and maintained Java implementation ...

enridaga avatar Jan 16 '21 16:01 enridaga

Maybe you can join forces with https://ontop-vkg.org/ ?

akuckartz avatar May 08 '21 06:05 akuckartz

Ontop is a Virtual Knowledge Graph system. It exposes the content of arbitrary relational databases as knowledge graphs. These graphs are virtual, which means that data remains in the data sources instead of being moved to another database.

Ontop translates SPARQL queries expressed over the knowledge graphs into SQL queries executed by the relational data sources. It relies on R2RML mappings and can take advantage of lightweight ontologies. https://ontop-vkg.org/guide/

akuckartz avatar May 08 '21 07:05 akuckartz

Definitely something to try! The open question is whether mappings in R2RML can be defined at the meta level (for example, expressing things such as "for each table/column") without needing to actually encode the schema elements in the mappings. If this is possible, we could design the mappings to Facade-X once and for all and give users access to any RDB on the fly.
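To make the "for each table/column" idea concrete, here is a minimal sketch that walks one table generically and emits Facade-X-style triples without knowing the schema in advance. It materializes rather than rewrites queries, and it uses Python's built-in sqlite3 only as a stand-in for a real RDB. The shape chosen (table as root container, rows as rdf:_n members, cells under xyz:<column>) and the blank-node naming are assumptions for illustration, not necessarily what the connector would do.

```python
import sqlite3

# Namespaces assumed to be the usual Facade-X ones.
XYZ = "http://sparql.xyz/facade-x/data/"
FX = "http://sparql.xyz/facade-x/ns/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def table_to_facade_x(conn, table):
    """Yield (subject, predicate, object) triples for every row of one table.

    The table name is interpolated into the SQL, so it must come from a
    trusted source (for example, the database catalog), not from user input.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]  # column names, discovered at runtime
    root = f"_:{table}"  # hypothetical blank-node label for the table
    yield (root, RDF + "type", FX + "root")
    for i, row in enumerate(cursor, start=1):
        row_node = f"_:{table}_row{i}"
        yield (root, f"{RDF}_{i}", row_node)  # container membership: rdf:_1, rdf:_2, ...
        for column, value in zip(columns, row):
            if value is not None:  # NULL cells produce no triple
                yield (row_node, XYZ + column, value)


# Small demonstration with a throwaway in-memory database.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE person (id INTEGER, name TEXT)")
demo_conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Ada"), (2, None)])
for triple in table_to_facade_x(demo_conn, "person"):
    print(triple)
```

Nothing in the function mentions "person", "id" or "name"; the column names are read from the cursor, which is the property a meta-level mapping would need.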

enridaga avatar May 10 '21 08:05 enridaga

Hi, just some input which might be of interest here:

In SANSA we created an integration of Ontop and Sparqlify with Apache Spark. Disclaimer: I am the developer of the SPARQL-to-SQL rewriter Sparqlify.

For this purpose we created this Jena-based R2RML layer, which is just the R2RML tooling without the query rewriting (though it includes a simple ARQ-based materializing R2RML processor which succeeds on all R2RML test cases). For the Ontop integration, the Jena model gets wrapped with commons-rdf, from where Ontop picks it up.

In any case, for the interlinking tool LIMES I once made a proposal (and prototype) which might be relevant here as well:

One could exploit nested service clauses to syntactically provide RDF-based mapping information:

SERVICE <x-sparql-anything:r2rml:ontop:jdbc:connection-string> {
  SERVICE <mapping:inline> { # R2RML content goes here
    [ a                      rr:TriplesMap ;
      rr:predicateObjectMap  [  ... ] ]
  }
  SERVICE <query> { # query goes here
    { SELECT (COUNT(*) AS ?count) { ?s ?p ?o } }
  }
}

Of course, the mapping could also be provided externally using <mapping:http://somesource>.

The R2RML spec also defines a default mapping for relational databases, called the direct mapping, which I suppose is pretty much the recipe for creating default R2RML mappings. Hence, if no explicit mapping is provided, this is the one that can be generated by default. Internally, the extended SPARQL processor may cache the generated mapping together with the connection string and use it whenever no other mapping is requested.

Aklakan avatar Jun 24 '21 21:06 Aklakan

The open question is whether mappings in R2RML can be defined at the meta level (for example, expressing things such as "for each table/column") without needing to actually encode the schema elements in the mappings.

I think R2RML needs rr:tableName, but usually there is a table of tables (and a table of columns), right?

E.g., in Postgres:

select * from pg_catalog.pg_tables;

If there is a common JDBC way to get at the table of tables, then we could macro-expand to produce the R2RML as needed. If there is not, then we could just write a big switch statement with a case for each flavor of RDB.
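For what it's worth, JDBC does expose the catalog in a driver-independent way through java.sql.DatabaseMetaData (getTables and getColumns). The sketch below instead illustrates the fallback "switch per flavor" option, written in Python with the built-in sqlite3 module so that it runs without a driver. The flavor keys and the function names are hypothetical, and only the SQLite branch is exercised here.

```python
import sqlite3

# One catalog query per RDB flavor: the "big switch statement" as a lookup table.
CATALOG_QUERIES = {
    "postgresql": (
        "SELECT tablename FROM pg_catalog.pg_tables "
        "WHERE schemaname NOT IN ('pg_catalog', 'information_schema')"
    ),
    "mysql": (
        "SELECT table_name FROM information_schema.tables "
        "WHERE table_schema = DATABASE()"
    ),
    "sqlite": (
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ),
}


def list_tables(conn, flavor):
    """Return the user table names, using the catalog query for this flavor."""
    try:
        query = CATALOG_QUERIES[flavor]
    except KeyError:
        raise ValueError(f"no catalog query known for {flavor!r}") from None
    cur = conn.cursor()
    cur.execute(query)
    return [row[0] for row in cur.fetchall()]


def list_columns(conn, table):
    """Return column names by running a query that selects no rows.

    The table name is interpolated unquoted; identifier quoting differs
    between flavors, so this assumes plain, trusted table names.
    """
    cur = conn.cursor()
    cur.execute(f"SELECT * FROM {table} WHERE 1 = 0")
    return [d[0] for d in cur.description]


# Demonstration against an in-memory SQLite database.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE person (id INTEGER, name TEXT)")
for name in list_tables(demo, "sqlite"):
    print(name, list_columns(demo, name))
```

The table and column names obtained this way are exactly what would be needed to macro-expand one rr:TriplesMap per table.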

justin2004 avatar Sep 06 '21 01:09 justin2004

Development has started on the jdbc branch: https://github.com/SPARQL-Anything/sparql.anything/tree/jdbc

enridaga avatar Dec 15 '22 11:12 enridaga