spark-dgraph-connector
A connector for Apache Spark and PySpark to Dgraph databases.
Having the sources report their partitioning to Spark allows Spark to exploit that partitioning and avoid shuffling all data for operations that require it. For instance, reading...
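For illustration only (assuming the spark-shell session `spark` and the connector's triples source), an aggregation on the very column the connector partitions by still plans an Exchange today:

```
import uk.co.gresearch.spark.dgraph.connector._

val triples = spark.read.dgraph.triples("localhost:9080")

// Even if the connector already partitioned the data by predicate, Spark does not know
// that and plans an Exchange (shuffle) for this aggregation:
triples.groupBy("predicate").count().explain()
// The physical plan typically contains: Exchange hashpartitioning(predicate#..., 200)
```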
Let's say I had a very large graph of IDs->commited->CommitNodes->modified->FileNodes where my edges are reversible. CommitNodes have an author_date that is a datetime type in the Dgraph schema. (IDs and...
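One way to express a traversal like this with the connector today is to load edges and wide nodes as DataFrames and join them. This is only a sketch: the column names, the wide-mode option, and the date literals below are assumptions based on the schema described in the question.

```
import org.apache.spark.sql.functions.col
import uk.co.gresearch.spark.dgraph.connector._

// Edge source: uid-to-uid edges, assumed columns subject, predicate, objectUid.
val edges = spark.read.dgraph.edges("localhost:9080")

// Wide node source: one column per predicate, so author_date is assumed to be a column.
val nodes = spark.read
  .option("dgraph.nodes.mode", "wide")   // assumption: wide-mode option name
  .dgraph.nodes("localhost:9080")

// Commits in a time window ...
val recentCommits = nodes
  .filter(col("author_date") >= "2021-01-01" && col("author_date") < "2021-02-01")
  .select(col("subject").as("commit"))

// ... and the files they modified, via the "modified" edges from the schema above.
val modifiedFiles = edges
  .filter(col("predicate") === "modified")
  .join(recentCommits, col("subject") === col("commit"))
  .select(col("objectUid").as("file"))
```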
Allow specifying a namespace in order to read graphs from namespaces other than the default one: `spark.read.option("dgraph.namespace", "foo").dgraph.triples("localhost:9080")`. The namespace will have to be used when creating a transaction...
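A minimal sketch of how the proposed option would be used; the option name `dgraph.namespace` is the one proposed in this issue, not an existing feature:

```
import uk.co.gresearch.spark.dgraph.connector._

// Proposed, not yet implemented: read triples from namespace "foo" instead of the default one.
// Internally the connector would also have to open its Dgraph transaction against that namespace.
val triples = spark.read
  .option("dgraph.namespace", "foo")
  .dgraph.triples("localhost:9080")
```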
Each partition could fetch the latest health information to identify alphas that should not be used. This avoids connecting to unhealthy alpha nodes while reading partition data. Health information could be...
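As a rough illustration (not connector code), one source of such health information is Dgraph's HTTP `/health` endpoint. The port mapping and the plain string check below are assumptions:

```
import scala.io.Source
import scala.util.Try

// Sketch only: query each alpha's HTTP /health endpoint and keep targets that report "healthy".
// Assumptions: the HTTP port is 8080 next to the 9080 gRPC port, and a plain string check
// is good enough here (a JSON parser would be more robust).
def healthyAlphas(grpcTargets: Seq[String]): Seq[String] =
  grpcTargets.filter { target =>
    val host = target.split(":")(0)
    Try(Source.fromURL(s"http://$host:8080/health").mkString).toOption
      .exists(_.contains("\"healthy\""))
  }

// healthyAlphas(Seq("alpha1:9080", "alpha2:9080"))
```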
The connector partitions the graph to allow Spark to read it in parallel. But Spark does not know anything about the partitioning. Say the connector partitions the graph by predicate...
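A minimal sketch against the Spark 3.1-era DataSource V2 API of what reporting a by-predicate partitioning could look like; the class name and clustering column are illustrative only:

```
import org.apache.spark.sql.connector.read.partitioning.{ClusteredDistribution, Distribution, Partitioning}

// Sketch: if the connector's Scan implemented SupportsReportPartitioning and returned a
// Partitioning like this, Spark could drop the Exchange for operations that only need the
// data clustered by "predicate".
class PredicateClusteredPartitioning(partitions: Int) extends Partitioning {
  // one Spark partition per connector partition
  override def numPartitions(): Int = partitions

  // claim that every partition holds complete groups of the "predicate" column
  override def satisfy(distribution: Distribution): Boolean = distribution match {
    case c: ClusteredDistribution => c.clusteredColumns.toSeq == Seq("predicate")
    case _ => false
  }
}
```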
The node source in wide mode has a column for each predicate. With language strings, each language of each predicate requires its own column, which needs to be known upfront....
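A sketch of what this could look like, assuming the wide-mode option from the README and hypothetical per-language column names such as `name@en`:

```
import uk.co.gresearch.spark.dgraph.connector._

// Sketch: in wide mode every predicate becomes a column, so language-tagged strings would
// need one column per language; the `name@en` / `name@de` column names are hypothetical.
val wideNodes = spark.read
  .option("dgraph.nodes.mode", "wide")   // assumption: wide-mode option name
  .dgraph.nodes("localhost:9080")

wideNodes.printSchema()
// root
//  |-- subject: long
//  |-- name@en: string     (hypothetical per-language column)
//  |-- name@de: string     (hypothetical per-language column)
```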
In #144 we have seen that using the connector with GraphFrames in PySpark needs quite a bit of code, whereas in Scala it is a single line of code. Add...
Hi, when I run the following code:
```
./spark-shell --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12,uk.co.gresearch.spark:spark-dgraph-connector_2.12:0.7.0-3.1
import uk.co.gresearch.spark.dgraph.graphframes._
import org.graphframes._
val graph: GraphFrame = spark.read.dgraph.graphframes("localhost:9080")
graph.vertices.show()
```
I get the following error:
```
21/10/18 12:12:18...
```
Spark can write `DataFrames` to sources in two modes: overwrite (erasing everything first) and append. Appending to (mutating) the Dgraph database would be great.
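Purely as a sketch of the requested feature (neither write support nor these options exist in the connector today), appending a hypothetical `triplesToAdd` DataFrame could look like a regular DataFrame write:

```
// Hypothetical: illustrates what an append (mutate) mode could look like as a DataFrame write.
triplesToAdd.write
  .mode("append")                                    // SaveMode.Append: mutate, do not drop data
  .format("uk.co.gresearch.spark.dgraph.triples")    // hypothetical write path for this format
  .save("localhost:9080")
```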