age
age copied to clipboard
How to export graph data for Giraph analysis?
Hi,
Is there some way to export graph data generated by AGE? So the data can be analyzed by Giraph.
Thanks, Steven
To export graph created by AGE, we can use the postgres COPY command and export the data to a CSV file, this is how the syntax looks like.
COPY ( SELECT * FROM <table name> ) TO 'absolute/path/to/export.csv' WITH CSV HEADER;
For example, supposing we have a graph with edges and vertex like the following below,
SELECT create_graph('library');
SELECT * FROM cypher('library', $$ CREATE (n:Book {name : "Purple Hibiscus"}) $$) AS (a agtype);
SELECT * FROM cypher('library', $$ CREATE (n:Book {name : "A Time to Kill"}) $$) AS (a agtype);
SELECT * FROM cypher('library', $$ CREATE (n:Book {name : "If Tomorrow Comes"}) $$) AS (a agtype);
SELECT * FROM cypher('library', $$ CREATE (n:Author {name : "Chimamanda Adichie"}) $$) AS (a agtype);
SELECT * FROM cypher('library', $$ CREATE (n:Author {name : "Sidney Sheldon"}) $$) AS (a agtype);
SELECT * FROM cypher('library', $$ CREATE (n:Author {name : "John Grisham"}) $$) AS (a agtype);
SELECT * FROM cypher('library', $$ MATCH (a:Book), (b:Author)
WHERE a.name = 'Purple Hibiscus' AND b.name = 'Chimamanda Adichie' CREATE (a)-[e:WrittenBy]->(b)RETURN e $$) as (e agtype);
SELECT * FROM cypher('library', $$ MATCH (a:Book), (b:Author)
WHERE a.name = 'A Time to Kill' AND b.name = 'John Grisham'
CREATE (a)-[e:WrittenBy]->(b) RETURN e $$) as (e agtype);
SELECT * FROM cypher('library', $$ MATCH (a:Book), (b:Author)
WHERE a.name = 'If Tomorrow Comes' AND b.name = 'Sidney Sheldon' CREATE (a)-[e:WrittenBy]->(b) RETURN e $$) as (e agtype);
Now, Postgres COPY command exports tables or SQL expressions to CSV, creating a graph in AGE creates different tables for edges, graphs and labels.
To view all the tables in one graph, you can use the command below in a psql shell \dt library.*
.
This will output the following for the graph created above.
Schema | Name | Type | Owner
---------+------------------+-------+---------
library | Author | table | chidera
library | Book | table | chidera
library | WrittenBy | table | chidera
library | _ag_label_edge | table | chidera
library | _ag_label_vertex | table | chidera
To export an edge or vertex, Its schema and the table name is used.
COPY ( SELECT * FROM library._ag_label_vertex ) TO '/mnt/c/Users/HP/ex.csv' WITH CSV HEADER;
To export the edges and vertices, an SQL expression can be used.
COPY (
SELECT v1.id AS "first_vertex_id", v1.properties AS "first_vertex_properties",
e.id AS "edge_id", e.start_id, e.end_id, e.properties AS "edge_properties",
v2.id AS "second_vertex_id", v2.properties AS "second_vertex_properties"
FROM library._ag_label_vertex AS v1
JOIN library._ag_label_edge AS e ON v1.id = e.start_id
JOIN library._ag_label_vertex AS v2 ON e.end_id = v2.id
) TO '/mnt/c/Users/HP/data.csv' WITH CSV HEADER;
Here are some of the steps you can follow to export data generated by Apache Age for analysis with Apache Giraph.
Let's say you have a graph that represents a set of teachers and their status. The teacher node has properties like "teacher_id", "name", "age", while the status edges will have properties like: "status_id", and "status".
To export this graph data from Apache AGE for analysis with Apache Giraph, you can follow these steps:
Export the data as a separate CSV file:
-
Use the below query to retrieve the Teacher's data from the tables in Apache Age and export it to a CSV file. For example, you will execute a query like:
COPY (SELECT teacher_id, name, age FROM teacher) TO '/path/to/teachers.csv' CSV HEADER;
Similarly, exporting the status data as a separate CSV file:
COPY (SELECT status_id, status FROM status) TO '/path/to/status.csv' CSV HEADER;
-
Now transport the above-exported data into an edge list.
- In Apache Giraph the data is typically represented as an edge list, where each line represents an edge between 2 nodes. You need to transform the exported data from apache age into this format.
- Create a new file graph.txt and write the transformed data in the edge list format. each line in the file would represent a status edge between 2 teachers.
- The file will look something like this:
-
teacher1, teacher2
teacher1, teacher3
teacher2, teacher3
-
Now import the transformed data into the Apache Giraph. You can use different tools of Giraph for analyzing the graph data.
Thanks to Chidera and Talha! I will take a try.
@bigplaice Hi Steven, any update if you were able to get this to work?
Hi all, please don't use ChatGPT to write comments
You can use the following instructions to load data from the file(csv).
Users can load the graph in two steps:
- Load Vertices in the first step
- Load Edges in the second step
function load_labels_from_file is used to load vertices from the CSV files.
load_labels_from_file('<graph name>',
'<label name>',
'<file path>')
Function load_edges_from_file can be used to load properties from the CSV file. Please see the file structure in the following.
load_edges_from_file('<graph name>',
'<label name>',
'<file path>');
Now for exporting the data, you have to use the pgsql built-in functions. An example has been mentioned below:
COPY ( SELECT * FROM library._ag_label_vertex ) TO '/C/Users/Waleed Ahmed Shahid/.csv' WITH CSV HEADER;
( SELECT * FROM (tablename ) TO '/path/to/export.csv' WITH HEADER; use this command
You can export the graph data from AGE in a CSV file which you can later import into Giraph. This link shows how to read CSV file in Giraph. To export the data in CSV you can use the COPY command in Postgresql
COPY ( SELECT * FROM library._ag_label_vertex ) TO 'path_to_csv' WITH CSV HEADER;
@bigplaice Hi Steven, any update if you could get this to work?
@bigplaice Hi Steven, any update if you could get this to work?
actually not yet. My project of graph database is freezed before I rush into verification. And probably would not get chance to restart the project. Thanks for help from all of you guys!