morpheus
Deleting a graph from PropertyGraphDataSource
The deleteGraph method doesn't take into account the constraints created on the metaLabel. The problem is that DETACH DELETE doesn't remove the associated constraints and indexes. This causes an error when a PropertyGraph is deleted and then written back under the same name. See the following example:
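val neo4jSource = GraphSources.cypher.neo4j(neo4jConfig)
val name = GraphName("arbitraryGraph")
// First write succeeds
neo4jSource.store(name, graph)
// Removes the graph's data, but its constraints and indexes stay behind
neo4jSource.delete(name)
// Second write under the same name fails with GraphAlreadyExistsException
neo4jSource.store(name, graph)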
That causes the following exception:
Exception in thread "main" org.opencypher.okapi.impl.exception.GraphAlreadyExistsException: A graph with name arbitraryGraph is already stored in this graph data source.
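To make the failure mode concrete, here is a toy model in plain Scala. It is a sketch under assumptions, not Morpheus code: `ToyDb`, `ToySource`, the `"___"` label prefix and the `dropSchemaOnDelete` flag are all hypothetical, and it assumes the "already stored" check looks at every label the database still knows about, including labels kept alive only by a constraint.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a database: labels carried by nodes, plus labels
// that are only referenced by constraints or indexes.
final class ToyDb {
  val nodeLabels: mutable.Set[String] = mutable.Set.empty[String]
  val schemaLabels: mutable.Set[String] = mutable.Set.empty[String]

  // Assumption: a label stays "known" as long as schema still refers to it.
  def knownLabels: Set[String] = (nodeLabels ++ schemaLabels).toSet
}

// Hypothetical data source that tags each stored graph with a meta label.
final class ToySource(db: ToyDb, dropSchemaOnDelete: Boolean) {
  private def metaLabel(name: String): String = "___" + name

  def store(name: String): Unit = {
    if (db.knownLabels.contains(metaLabel(name)))
      throw new IllegalStateException(s"A graph with name $name is already stored")
    db.nodeLabels += metaLabel(name)   // nodes written with the meta label
    db.schemaLabels += metaLabel(name) // constraint created on the meta label
  }

  def delete(name: String): Unit = {
    db.nodeLabels -= metaLabel(name)   // the effect of DETACH DELETE on the nodes
    if (dropSchemaOnDelete) db.schemaLabels -= metaLabel(name)
  }
}
```

With `dropSchemaOnDelete = false`, store-delete-store throws on the second `store`; with `true`, the same sequence succeeds.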
Moreover, it doesn't allow writing a PropertyGraph under entireGraphName, which makes no sense to me. What if I just want to store everything in the database and never restore that PropertyGraph (so the metaLabel and related properties, i.e. ___morpheusID, should not be saved), because I have other data sources and the graph database is only the destination for the processed data?
Hello @goshaQ and thanks for reaching out to us.
I agree that the store-delete-store probably should be reconsidered in the scenario you describe. However, it isn't exactly the expected way of working with the PGDSs, as storing a graph typically is a costly operation. Could you tell us a little bit more about your use case for this order of operations?
In the current design, we require a fixed way of viewing the entirety of a Neo4j database as a graph, and we must choose a name for this. This name may then not be used by any subgraph, because if we allowed that, it wouldn't be clear which graph you would get when referencing the name: the union of all the subgraphs (i.e. the entire graph) or just that one subgraph. In order to preserve the notion of the Neo4j database as one large graph, we need to reserve a name for this purpose.
In summary, I would agree with the following changes:
- Remove the metaProperty (e.g. ___morpheusID), similar to what Neo4jGraphMerge does, because this is only used as temporary information in the between-state of creating nodes and relationships in the Neo4j database
- Delete indexes/constraints when the graph is deleted, to not fail when writing another graph with a deleted graph's name
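As a rough illustration of the second point, the following sketch builds the DROP statements for one graph's meta label. It is an assumption-laden example rather than the planned implementation: `DropSchemaSketch`, `mentionsLabel` and `dropStatements` are hypothetical names, and the input strings are assumed to follow the description format returned by `CALL db.constraints()` and `CALL db.indexes()` in Neo4j 3.x, where prefixing a description with `DROP ` yields a valid statement. Newer Neo4j versions drop constraints and indexes by name instead.

```scala
import java.util.regex.Pattern

object DropSchemaSketch {
  // True if the description refers to exactly this label, not to a longer
  // label that merely starts with it.
  def mentionsLabel(description: String, label: String): Boolean =
    Pattern
      .compile(":" + Pattern.quote(label) + "(?![A-Za-z0-9_])")
      .matcher(description)
      .find()

  // Turns the schema descriptions that mention the meta label into DROP
  // statements (assumes the Neo4j 3.x description format).
  def dropStatements(descriptions: Seq[String], metaLabel: String): Seq[String] =
    descriptions.filter(mentionsLabel(_, metaLabel)).map("DROP " + _)
}
```

The resulting statements would be run alongside the DETACH DELETE, so that a later write under the same graph name starts from a clean schema.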
Does this sound sensible to you?
Thanks for your reply @Mats-SX !
I discovered the problem while experimenting with the final export stage of the processing pipeline. In particular, the exception appeared during an integration test, where a small amount of enriched data is expected to be written into Neo4j and removed at the end of the test.
The listed changes sound perfect to me!