Deleting a graph from PropertyGraphDataSource

Open goshaQ opened this issue 4 years ago • 2 comments

The deleteGraph method doesn't take into account the constraints created on the metaLabel. The problem is that DETACH DELETE doesn't remove the associated constraints and indexes, which causes an error when a PropertyGraph is deleted and then written back under the same name. See the following example:

val neo4jSource = GraphSources.cypher.neo4j(neo4jConfig)
val name = GraphName("arbitraryGraph")
neo4jSource.store(name, graph)
neo4jSource.delete(name)
neo4jSource.store(name, graph)

That causes the following exception:

Exception in thread "main" org.opencypher.okapi.impl.exception.GraphAlreadyExistsException: A graph with name arbitraryGraph is already stored in this graph data source.

Moreover, it doesn't allow writing a PropertyGraph under entireGraphName, which makes no sense to me. What if I just want to store everything in the database and never restore that PropertyGraph (so the metaLabel and related properties, i.e. ___morpheusID, should not be saved), because I have other data sources and the graph database is only the destination for the processed data?

goshaQ avatar Aug 19 '19 09:08 goshaQ

Hello @goshaQ and thanks for reaching out to us.

I agree that the store-delete-store sequence should probably be reconsidered in the scenario you describe. However, it isn't exactly the expected way of working with the PGDSs, as storing a graph is typically a costly operation. Could you tell us a little bit more about your use case for this order of operations?

In the current design, we require a fixed way of viewing the entirety of a Neo4j database as a graph, and we must choose a name for this. This name may then not be used by any subgraph: if we allowed that, it wouldn't be clear which graph you would get when referencing the name, the union of all the subgraphs (i.e. the entire graph) or just that one subgraph. In order to preserve the notion of the Neo4j database as one large graph, we need to reserve a name for this purpose.
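
To illustrate the ambiguity, here is a minimal sketch of how a graph name might be resolved to a node pattern. It is not the actual Morpheus code: the reserved name `graph`, the `___` meta-label prefix, and the helper names are assumptions made for this example.

```scala
object ReservedNameSketch {
  // Hypothetical reserved name for "the whole database viewed as one graph".
  val entireGraphName = "graph"

  // Assumption: subgraphs are told apart by a meta label derived from their
  // name, while the entire graph carries no meta label at all.
  def metaLabel(graphName: String): Option[String] =
    if (graphName == entireGraphName) None else Some(s"___$graphName")

  // Cypher query selecting the nodes that belong to the named graph.
  def nodeQuery(graphName: String): String =
    metaLabel(graphName) match {
      case Some(label) => s"MATCH (n:`$label`) RETURN n" // one subgraph
      case None        => "MATCH (n) RETURN n"           // everything
    }
}
```

If a subgraph could also be stored under the reserved name, the same name would have to resolve to both queries at once, which is why it is kept exclusive.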

In summary, I would agree with the following changes:

  • Remove the metaProperty (e.g. ___morpheusID), similar to what Neo4jGraphMerge does, because it is only used as temporary information in the intermediate state between creating the nodes and creating the relationships in the Neo4j database
  • Delete indexes/constraints when the graph is deleted, so that writing another graph under a deleted graph's name does not fail
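
The second change could be sketched roughly as follows. This is only an illustration, not the actual implementation: the `___` meta-label prefix, the uniqueness constraint on `___morpheusID`, and the helper names are assumptions, and the DROP statement uses Neo4j 3.x Cypher syntax (Neo4j 4.x and later drop constraints by name instead).

```scala
object DeleteGraphSketch {
  // Assumption: the meta label is the graph name with a "___" prefix.
  def metaLabel(graphName: String): String = s"___$graphName"

  // Meta property key mentioned in this thread.
  val metaPropertyKey = "___morpheusID"

  // Builds the Cypher statements a delete could run, in order:
  // first remove the data, then drop the schema left behind.
  def cleanupStatements(graphName: String): Seq[String] = {
    val label = metaLabel(graphName)
    Seq(
      // Removes nodes and their relationships, but no schema objects.
      s"MATCH (n:`$label`) DETACH DELETE n",
      // Hypothetical constraint assumed to be created when storing the graph.
      // Any additional indexes would need their own DROP INDEX statements.
      s"DROP CONSTRAINT ON (n:`$label`) ASSERT n.`$metaPropertyKey` IS UNIQUE"
    )
  }
}
```

Each statement would be run against the Neo4j session in turn, for example `DeleteGraphSketch.cleanupStatements("arbitraryGraph")` for the graph from the report above.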

Does this sound sensible to you?

Mats-SX avatar Sep 26 '19 10:09 Mats-SX

Thanks for your reply @Mats-SX !

I discovered the problem while experimenting with the final export stage of the processing pipeline. In particular, the exception appeared during an integration test, where a small amount of enriched data is expected to be written into Neo4j and removed at the end of the test.

The listed changes sound perfect to me!

goshaQ avatar Sep 28 '19 12:09 goshaQ