spark-dgraph-connector
Add write support to connector
Spark can write DataFrames to sources in two modes: overwrite (erasing everything first) and append. Appending to (mutating) the Dgraph database would be great.
Has there been any progress on this issue?
I thought @EnricoMi added write support as he worked through the https://github.com/G-Research/dgraph-dbpedia/ project. But looking at this again, I think it still needs doing.
@EnricoMi I know you're off at the moment, but can you clarify whether this support made it in?
@daveaitel @stackedsax write is not supported yet and is definitely a bigger piece of work. I also suspect it won't scale nicely, so don't expect huge write performance.
Thanks for confirming, Enrico. @daveaitel, what did you have in mind here?
Mostly I want to connect my Dgraph DB to Spark, have Spark run its algorithms (PageRank, etc.) on it, and then update the Dgraph database with the results. Is there a better way to do that?
-dave
@daveaitel so that would mean writing / updating a single value per node without modifying any edges. That should scale nicely.
The alternative is, of course, the non-scaling traditional pipeline: write the PageRank scores into a Dgraph-compatible RDF file and load it with the Dgraph live loader. Writing from Spark directly would mean a much smaller pipeline.
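To make the RDF-file alternative concrete, here is a minimal sketch of turning PageRank results into N-Quad lines. It assumes the scores are available as `(uid, score)` pairs, where `uid` is the integer uid of an existing Dgraph node, and that the score is stored under a predicate called `pagerank`. The function names and the predicate name are hypothetical and not part of the connector.

```python
from typing import Iterable, Iterator, Tuple


def pagerank_nquads(scores: Iterable[Tuple[int, float]],
                    predicate: str = "pagerank") -> Iterator[str]:
    """Yield one RDF N-Quad line per (uid, score) pair.

    `predicate` is a hypothetical predicate name; use the one from your schema.
    """
    for uid, score in scores:
        # Existing nodes are addressed by their hexadecimal uid, e.g. <0x2a>.
        # float() normalises numeric types before formatting the literal.
        yield f'<{uid:#x}> <{predicate}> "{float(score)!r}"^^<xs:double> .'


def write_rdf(scores: Iterable[Tuple[int, float]], path: str,
              predicate: str = "pagerank") -> int:
    """Write the N-Quads to `path` and return the number of lines written."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for line in pagerank_nquads(scores, predicate):
            out.write(line + "\n")
            count += 1
    return count
```

With PySpark, the pairs could come from something like `df.toLocalIterator()` over a DataFrame of uids and scores. That funnels every row through the driver, which is exactly why this pipeline does not scale. The resulting file would then be handed to the Dgraph live loader.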
Right, but from what we are saying in this thread, this is not currently possible because the connector cannot do writes?
That is right, writing to Dgraph from Spark is not supported.