load_labels_from_file produced duplicated graphId
Describe the bug: load_labels_from_file produces duplicate graphIds.
How are you accessing AGE (Command line, driver, etc.)?
- psql
What data setup do we need to do?
SELECT * FROM ag_catalog.create_graph('graph1');
SELECT create_vlabel('graph1', 'Person');
----------
1. First import of person.csv, which contains:
name,age
AAA,12
BBB,32,
CCC,33
select load_labels_from_file('graph1', 'Person', '/home/postgres/ws/testdata/person.csv', false);
select * from cypher ('graph1', $$
Match(v)
return v
$$) as (v agtype);
v
-------------------------------------------------------------------------------------------------------------
{"id": 844424930131969, "label": "Person", "properties": {"age": "12", "name": "AAA", "__id__": 1}}::vertex
{"id": 844424930131970, "label": "Person", "properties": {"age": "32", "name": "BBB", "__id__": 2}}::vertex
{"id": 844424930131971, "label": "Person", "properties": {"age": "33", "name": "CCC", "__id__": 3}}::vertex
(3 rows)
-----------------------------------------------
2. Second import of person.csv, which now contains:
name,age
DDD,54,
EEE,66,
FFF,73
select load_labels_from_file('graph1', 'Person', '/home/postgres/ws/testdata/person.csv', false);
select * from cypher ('graph1', $$
Match(v)
return v
$$) as (v agtype);
v
-------------------------------------------------------------------------------------------------------------
{"id": 844424930131969, "label": "Person", "properties": {"age": "12", "name": "AAA", "__id__": 1}}::vertex
{"id": 844424930131970, "label": "Person", "properties": {"age": "32", "name": "BBB", "__id__": 2}}::vertex
{"id": 844424930131971, "label": "Person", "properties": {"age": "33", "name": "CCC", "__id__": 3}}::vertex
{"id": 844424930131969, "label": "Person", "properties": {"age": "54", "name": "DDD", "__id__": 1}}::vertex
{"id": 844424930131970, "label": "Person", "properties": {"age": "66", "name": "EEE", "__id__": 2}}::vertex
{"id": 844424930131971, "label": "Person", "properties": {"age": "73", "name": "FFF", "__id__": 3}}::vertex
(6 rows)
Expected behavior: each vertex has a unique graphId.
Environment (please complete the following information):
- Version: 1.5.0
- PG 11
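The duplicated ids in the second result set can be checked and decoded with a small sketch. It assumes the usual AGE graphid layout (label id in the upper 16 bits, per-label entry id in the lower 48 bits), which is consistent with the values shown in the report; the helper name `split_graphid` is hypothetical and not part of AGE.

```python
from collections import Counter
from typing import Tuple

# Assumption: the low 48 bits of a graphid hold the per-label entry id,
# and the remaining high bits hold the label id.
ENTRY_ID_BITS = 48
ENTRY_ID_MASK = (1 << ENTRY_ID_BITS) - 1


def split_graphid(graphid: int) -> Tuple[int, int]:
    """Return (label_id, entry_id) for a 64-bit graphid (hypothetical helper)."""
    return graphid >> ENTRY_ID_BITS, graphid & ENTRY_ID_MASK


# ids copied from the six rows returned after the second import
ids_after_second_import = [
    844424930131969, 844424930131970, 844424930131971,
    844424930131969, 844424930131970, 844424930131971,
]

# Any id that occurs more than once violates the expected uniqueness.
duplicates = sorted(
    gid for gid, count in Counter(ids_after_second_import).items() if count > 1
)

for gid in duplicates:
    label_id, entry_id = split_graphid(gid)
    print(gid, "label_id =", label_id, "entry_id =", entry_id)
```

Under that assumption every duplicated id decodes to the same label id with entry ids 1, 2 and 3, matching the `__id__` values in the output: both imports numbered their rows starting from 1.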
@wen-bing This problem is usually caused by loading multiple files, or the same file, to the same label.
For example, reusing the same vertices file with the same label will cause duplicates because you are reusing the same vertex ids. Remember, the CSV for vertices contains the vertex id.
Another example is loading multiple edge files to the same edge label. This is due to how the edges are created from the CSV. Remember, the CSV for edges does not contain the edge id, so the loader has to generate one. Unfortunately, the generated id is based on the row's position in the CSV.
This is an issue that we are currently looking into.
Hope this is helpful.
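The row-based id generation described in the comment above can be illustrated with a minimal simulation. This is not the loader's code: the function names, the label id of 3, and the 48-bit entry id layout are assumptions chosen to match the ids in the report. The counter-based variant shows one general way to avoid the collision, not necessarily what any loader fix does.

```python
import itertools

ENTRY_ID_BITS = 48  # assumption: entry id occupies the low 48 bits
LABEL_ID = 3        # hypothetical label id, consistent with the reported ids


def make_graphid(label_id, entry_id):
    """Pack a label id and an entry id into one integer id."""
    return (label_id << ENTRY_ID_BITS) | entry_id


def load_by_row_number(label_id, rows):
    """Derive ids from the CSV row position; every load restarts at 1."""
    return [make_graphid(label_id, i) for i, _ in enumerate(rows, start=1)]


def load_with_counter(label_id, rows, counter):
    """Draw ids from a counter that persists across loads."""
    return [make_graphid(label_id, next(counter)) for _ in rows]


first_file = ["AAA", "BBB", "CCC"]
second_file = ["DDD", "EEE", "FFF"]

# Two loads with row-based ids: the second load reuses the first load's ids.
row_ids = (load_by_row_number(LABEL_ID, first_file)
           + load_by_row_number(LABEL_ID, second_file))

# Two loads sharing one counter: ids keep increasing across loads.
counter = itertools.count(1)
seq_ids = (load_with_counter(LABEL_ID, first_file, counter)
           + load_with_counter(LABEL_ID, second_file, counter))

print(len(set(row_ids)), len(set(seq_ids)))  # prints "3 6"
```

With row-based ids, six loaded rows yield only three distinct ids, which is the symptom in the report. With a shared counter, all six are distinct.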
The main issue is that Apache AGE does not automatically advance the sequence for graph IDs (graphid) after a bulk import. We would greatly appreciate it if a solution could be provided as soon as possible.
@wen-bing @sintova27 A recent update to the AGE CSV loader (e370db34afb255236a77012882d17e3f396e7f5b) should address the issues you encountered. Please give it a try and let me know if you encounter any other issues.
This issue is stale because it has been open 60 days with no activity. Remove "Abondoned" label or comment or this will be closed in 14 days.
This issue was closed because it has been stalled for further 14 days with no activity.