
A connector for Apache Spark and PySpark to Dgraph databases.

52 spark-dgraph-connector issues, sorted by recently updated

Having the sources report their partitioning to Spark allows Spark to exploit the existing partitioning and avoid shuffling all data for operations that require it. For instance, reading...

Let's say I had a very large graph of IDs->committed->CommitNodes->modified->FileNodes where my edges are reversible. CommitNodes have an author_date that is a datetime type in the Dgraph schema. (IDs and...

question

Allow specifying a namespace in order to read graphs from namespaces other than the default one: spark.read.option("dgraph.namespace", "foo").dgraph.triples("localhost:9080"). The namespace will have to be used when creating a transaction...
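The proposed read could look like the following in Scala. This is a sketch of a usage fragment based only on the option name suggested in this issue; `dgraph.namespace` and the value `foo` are the issue's own examples, not a released connector feature, and a Dgraph alpha on localhost:9080 is assumed.

```scala
// Sketch of the API proposed in this issue (not yet implemented):
// pass the namespace as a read option before selecting the source.
import uk.co.gresearch.spark.dgraph.connector._

val triples = spark.read
  .option("dgraph.namespace", "foo")   // proposed option: read namespace "foo" instead of the default
  .dgraph.triples("localhost:9080")
```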

enhancement

Add support for Dgraph authorization.

enhancement

Each partition could fetch the latest health information to identify alphas that should not be used. This avoids connecting to unhealthy alpha nodes while reading partition data. Health information could be...

enhancement

The connector partitions the graph to allow Spark to read it in parallel. But Spark does not know anything about the partitioning. Say the connector partitions the graph by predicate...

enhancement

The node source in wide mode has a column for each predicate. With language strings, each language of each predicate requires its own column, which needs to be known upfront....

enhancement

In #144 we have seen that using the connector with GraphFrames in PySpark needs quite a bit of code, whereas in Scala it is a single line of code. Add...
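For comparison, the Scala one-liner referred to here (as shown in the issue below reporting a GraphFrames error) is roughly the following; it assumes a Dgraph alpha on localhost:9080 and the GraphFrames package on the classpath.

```scala
// Scala: once the connector's GraphFrames implicits are imported,
// loading a GraphFrame from Dgraph is a single call.
import uk.co.gresearch.spark.dgraph.graphframes._
import org.graphframes.GraphFrame

val graph: GraphFrame = spark.read.dgraph.graphframes("localhost:9080")
```

The requested enhancement would give PySpark a similarly short call.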

enhancement
python

Hi, when I run the following code:

```
./spark-shell --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12,uk.co.gresearch.spark:spark-dgraph-connector_2.12:0.7.0-3.1

import uk.co.gresearch.spark.dgraph.graphframes._
import org.graphframes._

val graph: GraphFrame = spark.read.dgraph.graphframes("localhost:9080")
graph.vertices.show()
```

I get the following error:

```
21/10/18 12:12:18...
```

enhancement

Spark can write `DataFrames` to sources in two modes: overwrite (erasing everything first) and append. Appending to (mutating) the Dgraph database would be great.
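A hypothetical sketch of what an append-mode write might look like; the connector currently only reads, and the format name and `save` argument here are assumptions for illustration, not an existing API.

```scala
// Hypothetical: appending (mutating) a triples DataFrame into Dgraph.
// SaveMode.Append would add triples; SaveMode.Overwrite would erase the
// database first. Neither mode is implemented by the connector today.
import org.apache.spark.sql.SaveMode

df.write
  .mode(SaveMode.Append)
  .format("uk.co.gresearch.spark.dgraph.triples")  // assumed format name
  .save("localhost:9080")                          // assumed target signature
```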

enhancement