
A connector for Apache Spark and PySpark to Dgraph databases.

52 spark-dgraph-connector issues, sorted by recently updated

Having the sources report their partitioning to Spark allows Spark to exploit the existing partitioning and avoid shuffling all data for operations that require it. For instance, reading...

Let's say I had a very large graph of IDs->committed->CommitNodes->modified->FileNodes where my edges are reversible. CommitNodes have an author_date that is a datetime type in the Dgraph schema. (IDs and...

question

Allow specifying a namespace in order to read graphs from namespaces other than the default one: spark.read.option("dgraph.namespace", "foo").dgraph.triples("localhost:9080"). The namespace will have to be used when creating a transaction...
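The proposed read could look like the following in Scala. This is a sketch of a usage fragment based only on the option name suggested in this issue; `dgraph.namespace` and the value `foo` are the issue's own examples, not a released connector feature, and a Dgraph alpha on localhost:9080 is assumed.

```scala
// Sketch of the API proposed in this issue (not yet implemented):
// pass the namespace as a read option before selecting the source.
import uk.co.gresearch.spark.dgraph.connector._

val triples = spark.read
  .option("dgraph.namespace", "foo")   // proposed option: read namespace "foo" instead of the default
  .dgraph.triples("localhost:9080")
```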

enhancement

Add support for Dgraph authorization.

enhancement

Each partition could fetch the latest health information to identify alphas that should not be used. This avoids connecting to unhealthy alpha nodes while reading partition data. Health information could be...

enhancement

The connector partitions the graph to allow Spark to read it in parallel. But Spark does not know anything about the partitioning. Say the connector partitions the graph by predicate...

enhancement

The node source in wide mode has a column for each predicate. With language strings, each language of each predicate requires its own column, which needs to be known upfront....

enhancement

In #144 we have seen that using the connector with GraphFrames in PySpark needs quite a bit of code, whereas in Scala it is a single line of code. Add...
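For comparison, the Scala one-liner referred to here (as shown in the issue below reporting a GraphFrames error) is roughly the following; it assumes a Dgraph alpha on localhost:9080 and the GraphFrames package on the classpath.

```scala
// Scala: once the connector's GraphFrames implicits are imported,
// loading a GraphFrame from Dgraph is a single call.
import uk.co.gresearch.spark.dgraph.graphframes._
import org.graphframes.GraphFrame

val graph: GraphFrame = spark.read.dgraph.graphframes("localhost:9080")
```

The requested enhancement would give PySpark a similarly short call.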

enhancement
python

Hi, when I run the following code:

```
./spark-shell --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12,uk.co.gresearch.spark:spark-dgraph-connector_2.12:0.7.0-3.1

import uk.co.gresearch.spark.dgraph.graphframes._
import org.graphframes._

val graph: GraphFrame = spark.read.dgraph.graphframes("localhost:9080")
graph.vertices.show()
```

I get the following error:

```
21/10/18 12:12:18...
```

enhancement

Spark can write `DataFrames` to sources in two modes: overwrite (erasing everything first) and append. Appending to (mutating) the Dgraph database would be great.
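A hypothetical sketch of what an append-mode write might look like; the connector currently only reads, and the format name and `save` argument here are assumptions for illustration, not an existing API.

```scala
// Hypothetical: appending (mutating) a triples DataFrame into Dgraph.
// SaveMode.Append would add triples; SaveMode.Overwrite would erase the
// database first. Neither mode is implemented by the connector today.
import org.apache.spark.sql.SaveMode

df.write
  .mode(SaveMode.Append)
  .format("uk.co.gresearch.spark.dgraph.triples")  // assumed format name
  .save("localhost:9080")                          // assumed target signature
```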

enhancement