ProjectDomino

Dynamic Neo4j large ~parquet export

Open lmeyerov opened this issue 5 years ago • 2 comments

Tracks current effort to get Neo4j to export ~100M node/edge parquet/arrow graphs in decent time for use by analytics stacks

This is for fast on-the-fly mode: dynamic cypher query -> parquet/arrow
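A rough sketch of what the on-the-fly path could look like from Python, assuming the official `neo4j` driver and `pyarrow` are installed. Nothing here comes from an existing branch: the function names `batched_columns` and `export_query_to_parquet`, the batch size, and the connection details are all hypothetical. It also assumes the query returns flat scalar columns (for example `RETURN n.id AS id, n.name AS name`) rather than whole nodes or relationships.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List


def batched_columns(
    records: Iterable[Dict[str, Any]], batch_size: int = 50_000
) -> Iterator[Dict[str, List[Any]]]:
    """Group row dicts into column-oriented batches (column name -> list of values)."""
    it = iter(records)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        # Column names are taken from the first row of each batch;
        # keys missing from later rows become None.
        keys = list(chunk[0].keys())
        yield {k: [row.get(k) for row in chunk] for k in keys}


def export_query_to_parquet(
    uri: str, auth: tuple, query: str, out_path: str, batch_size: int = 50_000
) -> int:
    """Stream a Cypher query result into one Parquet file; returns the row count."""
    # Imported lazily so the batching helper above works without these packages.
    from neo4j import GraphDatabase
    import pyarrow as pa
    import pyarrow.parquet as pq

    driver = GraphDatabase.driver(uri, auth=auth)
    writer = None
    schema = None
    total = 0
    try:
        with driver.session() as session:
            rows = (record.data() for record in session.run(query))
            for cols in batched_columns(rows, batch_size):
                if writer is None:
                    # Assumption: the first batch is representative, so its
                    # inferred schema is reused for every later batch.
                    table = pa.Table.from_pydict(cols)
                    schema = table.schema
                    writer = pq.ParquetWriter(out_path, schema)
                else:
                    table = pa.Table.from_pydict(cols, schema=schema)
                writer.write_table(table)
                total += table.num_rows
    finally:
        if writer is not None:
            writer.close()
        driver.close()
    return total
```

Streaming in batches keeps memory bounded, but going through Python dicts row by row is unlikely to be fast enough on its own at ~100M rows; the sketch only shows the shape of the pipeline.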

lmeyerov avatar Mar 30 '20 22:03 lmeyerov

If there's a WIP branch anywhere I'd be keen to take a look. Unsure if this piece was ever started

vilkinsons avatar Apr 06 '21 13:04 vilkinsons

Someone at Neo4j started this but I don't think they made progress

I'm not sure of the current state in Neo4j land. One of my guesses was that, because of Neo4j's Spark connector (cypher query -> Neo4j -> Spark RDD?), there might be a typed bulk exporter, and since Spark is already Arrow-friendly, we could co-opt it.
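To make that guess concrete, here is a minimal PySpark sketch. It assumes the Neo4j Connector for Apache Spark is on the Spark classpath and that it accepts the `org.neo4j.spark.DataSource` format with the `url`, `query`, and `authentication.basic.*` options; check those against the connector docs for your version. The helper names `neo4j_reader_options` and `cypher_to_parquet` are hypothetical, and this is untested at the ~100M scale.

```python
from typing import Dict


def neo4j_reader_options(url: str, user: str, password: str, query: str) -> Dict[str, str]:
    """Build the reader options assumed to be understood by the Neo4j Spark connector."""
    return {
        "url": url,
        "authentication.basic.username": user,
        "authentication.basic.password": password,
        "query": query,
    }


def cypher_to_parquet(spark, options: Dict[str, str], out_path: str) -> None:
    """Run a Cypher query through the connector and write the result as Parquet.

    `spark` is an existing pyspark.sql.SparkSession started with the
    connector jar available (an assumption about the deployment).
    """
    df = (
        spark.read.format("org.neo4j.spark.DataSource")
        .options(**options)
        .load()
    )
    # Spark writes a directory of Parquet part files, not a single file.
    df.write.mode("overwrite").parquet(out_path)
```

Usage would be along the lines of `cypher_to_parquet(spark, neo4j_reader_options("bolt://localhost:7687", "neo4j", "secret", "MATCH (n) RETURN n.id AS id"), "/tmp/nodes.parquet")`, where the URL, credentials, and path are placeholders.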

lmeyerov avatar Apr 06 '21 17:04 lmeyerov