
support SPARQL

Open wangdsh opened this issue 4 years ago • 8 comments

I think SPARQL query support is important. The reason is explained here:
https://github.com/dgraph-io/dgraph/issues/1#issuecomment-266360074
I hope Dgraph will support SPARQL queries in 2020.

wangdsh avatar Dec 31 '19 05:12 wangdsh

1000% upvote, SPARQL is super important.

Reasons can be seen here:

Comment

marvin-hansen avatar Feb 26 '20 23:02 marvin-hansen

@marvin-hansen @wangdsh

Agreed. Cypher mainly benefits the application layer and is easy to use, while Gremlin has a richer community that, as far as I am concerned, no other graph DB can compete with. Gremlin also has richer expressivity for performing arbitrarily complex graph traversals, as you mentioned in a previous post. The problem with Gremlin is that as queries get more complex, it becomes more difficult to perform ad-hoc/automatic optimizations, which I think are essential for any queryable database. Optimizations are even more difficult for users without a solid background in databases and graph traversal virtual machines.

But the real origin is SPARQL, which has a clean, simple syntax and yet powerful expressivity. I really hope to see SPARQL support. It would benefit both the graph DB and semantic web communities, and would probably have a deeper influence on future web technology.

suesunss avatar Mar 09 '20 08:03 suesunss

The biggest challenge with SPARQL is that the language was made to support triple store databases that follow W3C standards. Dgraph isn't an RDF triple store, despite using RDF (in its simplest, rawest format).

IMHO, it would be kind of chaotic to maintain support for different languages with different standards and different requirements/needs.

Imagine the chaos of maintaining GraphQL, GraphQL+-, Gremlin, Cypher and SPARQL. Keeping them all in sync with each new Dgraph feature would be herculean work. Even if we supported just one of them, we would have to abandon GraphQL+- in favor of a new language that we don't control (and redesign Dgraph): whenever we added a feature to Dgraph, we would have to ask the language maintainers to add it to the language spec.

I believe it would be easier for you to pick the features you like most in SPARQL (or any other language) and ask for them to be supported in Dgraph than to add another language. Or even to suggest changes to the GraphQL+- syntax.

That's my two cents as a user

BTW, my opinion doesn't reflect what Dgraph in general thinks.

Extra example of the difference between Dgraph and SPARQL

SPARQL uses "PREFIX", which is linked to the identifier stored in the RDF store format. In Dgraph, this identifier is converted to UIDs. So, to make this work, it is necessary to sanitize the dataset, which makes it incompatible with any other RDF store (meaning that when you export the RDF, you need to revert the sanitization and also rebuild the identifiers yourself).

Also, the sanitization would need a "hacky" way to make the keyword "PREFIX" work, and one approach would have to be defined, e.g. using Dgraph's type system or edges to represent the PREFIX inside Dgraph.
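
To make the identifier problem concrete, here is a minimal sketch in Python. It is only an illustration: `PREFIXES`, `expand` and `UidMap` are hypothetical names, not Dgraph APIs, and the sequential numbering is an assumption, not how Dgraph actually assigns UIDs. It shows that once prefixed names are expanded to IRIs and replaced by numeric UIDs, the original identifiers can only be rebuilt if the mapping is kept somewhere.

```python
# Hypothetical sketch: none of these names are Dgraph APIs.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "ex": "http://example.org/",
}


def expand(name, prefixes):
    """Expand a prefixed name such as 'ex:alice' into a full IRI."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name  # unknown prefix or already a full IRI: leave unchanged


class UidMap:
    """Assigns a numeric UID to each IRI and remembers the reverse mapping."""

    def __init__(self):
        self._to_uid = {}
        self._to_iri = {}

    def uid_for(self, iri):
        # Assumption for illustration: UIDs are handed out sequentially.
        if iri not in self._to_uid:
            uid = len(self._to_uid) + 1
            self._to_uid[iri] = uid
            self._to_iri[uid] = iri
        return self._to_uid[iri]

    def iri_for(self, uid):
        # Without this reverse table, an export could not rebuild the IRIs.
        return self._to_iri[uid]


uids = UidMap()
alice = uids.uid_for(expand("ex:alice", PREFIXES))
bob = uids.uid_for(expand("ex:bob", PREFIXES))
```

Exporting back to another RDF store would need `iri_for` (or something equivalent stored in the graph), which is the "revert the sanitize" step described above.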

MichelDiz avatar Mar 09 '20 18:03 MichelDiz

@MichelDiz

If Dgraph is neither an RDF triple store nor a quad store, what then is the underlying storage system?

Is everything reduced to key-value in badger?

And what was the discerning criterion to omit the foundation for the semantic web?

I understand that, lacking the foundation of a triple or quad store, there is actually very little that can be done to fully support RDF & SPARQL.

What's the long-term vision of Dgraph?

I had a look at the roadmap, and I appreciate all the effort and dedication that goes into making things better, but I am just curious to know where this is going in the long run.

marvin-hansen avatar Mar 11 '20 21:03 marvin-hansen

  1. It is basically posting lists stored as key-value pairs in BadgerDB. You can read more about it in the newly released paper: https://github.com/dgraph-io/dgraph/blob/master/paper/dgraph.pdf

An RDF triple is basically a "KV" with an identifier.

Dgraph is a triple system, but not exactly a triple store per se.

  2. Yes.

  3. I am not sure, because I was not present when Dgraph started. But I have a slight idea of why.

Basically Dgraph was "mirroring" itself in GraphQL. And GraphQL had no specific mutation patterns other than JSON objects inside a mutation block. Perhaps the engineers who started the project with Manish had some familiarity with RDF (they certainly took university classes covering the semantic web), so RDF seemed an obvious choice at that moment. But not the whole package.

I have this slight idea after reading old commits. But I can ask Manish about it.

Anyway, GraphQL does not use semantic web standards, and neither do we. And there was no demand for this feature, so the DB matured without it. Besides, Dgraph is a DB aimed at common web services (like NoSQL is) and not at ontologies or similar. You can still do it, as any graph DB is customizable, but it would not follow any specific standard, and you would have to "fit" it into GraphQL+-.
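
On the storage question answered above (posting lists kept as key-value pairs in Badger), a rough sketch may help. It assumes, following the description in the Dgraph paper, that triples are grouped under a key made of the predicate and the subject UID, with the value being a sorted list of object UIDs. `PostingStore` is a hypothetical in-memory stand-in, not Badger and not Dgraph's actual code.

```python
from bisect import insort
from collections import defaultdict


class PostingStore:
    """Toy key-value store: (predicate, subject UID) -> sorted object UIDs."""

    def __init__(self):
        # Assumption: a plain dict stands in for the on-disk key-value store.
        self._kv = defaultdict(list)

    def add_edge(self, subject, predicate, obj):
        """Record the triple (subject, predicate, obj) in its posting list."""
        posting = self._kv[(predicate, subject)]
        if obj not in posting:
            insort(posting, obj)  # keep the posting list sorted

    def objects(self, subject, predicate):
        """Return a copy of the posting list for one subject and predicate."""
        return list(self._kv.get((predicate, subject), []))


store = PostingStore()
store.add_edge(1, "friend", 3)
store.add_edge(1, "friend", 2)
store.add_edge(1, "friend", 3)  # duplicate triple, ignored
friends_of_1 = store.objects(1, "friend")
```

In this layout a single key lookup returns every object for one subject and predicate, which is why it behaves like a triple system without being a triple store with named graphs, IRIs and prefixes.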

I understand that, lacking the foundation of a triple or quad store, there is actually very little that can be done to fully support RDF & SPARQL.

We can try to support JSON-LD, a format to which several RDF DBs can export their data. That's why I have opened some issues on this topic: https://github.com/dgraph-io/dgraph/issues/4897

And also https://github.com/dgraph-io/dgraph/issues/4898 https://github.com/dgraph-io/dgraph/issues/4915

All of these are small steps to let users easily import data coming from RDF triple stores (I am studying the problems related to this). We could import JSON-LD and export a JSON file 99.9% similar to JSON-LD, which is compatible with several tools out there.

What's the long-term vision of Dgraph?

We are discussing it here: https://discuss.dgraph.io/t/dgraphs-new-versioning-scheme/6106/4

Do you mean SPARQL? I'm not sure. We have to finish support for the GraphQL spec first. There is a lot to be done before starting a new adventure.

MichelDiz avatar Mar 11 '20 22:03 MichelDiz

Thank you.

Basically Dgraph was "mirroring" itself in GraphQL.

I suspected that a few times, but now it's official. There is no point in supporting SPARQL or JSON-LD given the available foundation and the actual goal. No need to add unnecessary complexity.

Dgraph is a DB aimed at common web services

Please stress that a bit more on the landing page and in the intro of the documentation, to make it clear what it is and what it's not.

Thank you for the clarification.

marvin-hansen avatar Mar 11 '20 23:03 marvin-hansen

We can still have full support for JSON-LD, because users can bring their data in and, if they wish, "sanitize" it using a Bulk Upsert mutation. There is nothing hard in JSON-LD that we can't deal with.
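
As a rough illustration of what such a "sanitize" step could look like, here is a client-side sketch in Python. It is not the upsert-based approach mentioned above and not an existing Dgraph feature. `sanitize` and `restore` are hypothetical helpers, the `xid` field for the original identifier is an assumed naming choice, and the `@context` is assumed to map each term to a plain IRI string (real JSON-LD contexts can be richer).

```python
def sanitize(doc):
    """Turn a flat JSON-LD node into a plain JSON object for import."""
    context = doc.get("@context", {})
    node = {}
    for key, value in doc.items():
        if key == "@context":
            continue  # the context is kept by the caller, not in the node
        if key == "@id":
            node["xid"] = value  # assumption: keep the IRI as an external id
        elif key == "@type":
            node["dgraph.type"] = value
        else:
            node[context.get(key, key)] = value  # expand term to full IRI
    return node


def restore(node, context):
    """Rebuild the JSON-LD node from the sanitized form and its context."""
    terms = {iri: term for term, iri in context.items()}
    doc = {"@context": context}
    for key, value in node.items():
        if key == "xid":
            doc["@id"] = value
        elif key == "dgraph.type":
            doc["@type"] = value
        else:
            doc[terms.get(key, key)] = value  # compact IRI back to its term
    return doc


doc = {
    "@context": {"name": "http://xmlns.com/foaf/0.1/name"},
    "@id": "http://example.org/alice",
    "@type": "Person",
    "name": "Alice",
}
node = sanitize(doc)
round_trip = restore(node, doc["@context"])
```

The round trip only works because the context and the original identifier are preserved, which matches the earlier point that the identifiers have to be rebuilt on export.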

MichelDiz avatar Mar 11 '20 23:03 MichelDiz

Github issues have been deprecated. This issue has been moved to discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.


minhaj-shakeel avatar Jul 20 '20 19:07 minhaj-shakeel

This issue is not closed - it's just moved - what a pity. I'd love to join the discussion but https://discuss.dgraph.io/t/hope-to-support-sparql-query-in-2020/8809 is just not up to the task - the Discuss UI is IMHO horrible! Please come back to GitHub.

WolfgangFahl avatar Aug 10 '20 16:08 WolfgangFahl

@MichelDiz - great that this is reopened. I might actually try dgraph again now.

WolfgangFahl avatar Sep 05 '22 08:09 WolfgangFahl

Any update on sparql support?

mediaprophet avatar Jan 07 '23 06:01 mediaprophet

Nope, this might take much longer. Supporting GraphQL took a year and a half of intensive work with a dedicated team. It would be nice to have support from the community. Perhaps researching the best approach would help advance the planning. But so far there has been zero work or research on SPARQL.

MichelDiz avatar Jan 07 '23 07:01 MichelDiz