OrientDB-NET.binary

How could we add an edge to existing nodes without loading them first?

Open masimplo opened this issue 10 years ago • 19 comments

Say, for example, that I have a Person node with outgoing edges of type Friend, each leading to another Person where it appears as an incoming edge. To create a Friend edge between two persons inside a transaction with the current driver, I have to first load Person1 and Person2 (or create Person2 if it's a new person) and then add each one's ORID to the other's respective collection (I'm talking about lightweight edges, but heavy edges are not that different). This looks fine in the test cases, but in a real-world scenario I would have to load a person that might have thousands of items in its list of outgoing or incoming edges and add to that collection, which leads to performance issues and concurrency exceptions.

It would be great if edges could be created inside a transaction in a manner similar to "Create edge from #1 to #2", without loading the nodes first. I am not 100% sure, but I think the Java and Node.js clients already do it this way.

masimplo avatar Jan 21 '15 11:01 masimplo

Good point. I'll take a look and see what I can figure out.

poindextrose avatar Jan 22 '15 01:01 poindextrose

I'm having trouble figuring out how this can be accomplished with the binary protocol, which is what OTransaction uses. The REQUEST_TX_COMMIT documentation only lists three operations: UPDATES, DELETIONS, and CREATIONS.

Maybe I'm missing something? Can you please point to any examples or tests that might be relevant?

It looks like the REST API SQL batch operation can create an edge as you describe inside a transaction. I don't see that implemented in this client. Maybe that would be the route to take in making this happen?
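
For reference, the SQL batch route could look roughly like the sketch below. It only builds the script text and does not send it anywhere. The class name `EdgeBatchScript`, the method `createEdgeBetweenExisting`, the `Friend` edge class, and the RIDs are all made up for illustration. The BEGIN / CREATE EDGE / COMMIT lines follow OrientDB's SQL batch syntax, but whether a given server version accepts such a script from this client would still need to be checked.

```java
import java.util.regex.Pattern;

// Hypothetical helper: builds an OrientDB SQL batch script as plain text.
// It does not talk to a server; sending the script is left to the driver.
final class EdgeBatchScript {
    // Persistent RIDs look like #<cluster>:<position> with non-negative numbers.
    private static final Pattern RID = Pattern.compile("#\\d+:\\d+");
    private static final Pattern CLASS_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Returns a three-line batch that creates one edge between two existing
    // records without loading either of them on the client.
    static String createEdgeBetweenExisting(String edgeClass, String fromRid, String toRid) {
        if (!CLASS_NAME.matcher(edgeClass).matches()) {
            throw new IllegalArgumentException("Invalid edge class: " + edgeClass);
        }
        if (!RID.matcher(fromRid).matches() || !RID.matcher(toRid).matches()) {
            throw new IllegalArgumentException("Expected persistent RIDs such as #12:0");
        }
        return String.join("\n",
            "BEGIN",
            "CREATE EDGE " + edgeClass + " FROM " + fromRid + " TO " + toRid,
            "COMMIT");
    }

    public static void main(String[] args) {
        System.out.println(createEdgeBetweenExisting("Friend", "#12:0", "#12:7"));
    }
}
```

The helper deliberately rejects temporary (negative) RIDs, since it is unclear whether a command issued this way can refer to records that exist only inside a pending client-side transaction.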

poindextrose avatar Jan 22 '15 05:01 poindextrose

I will take a look at how this is implemented in other clients and let you know. Right now it does seem that the binary protocol might not allow this, but I am pretty sure other clients are doing it somehow.

masimplo avatar Jan 23 '15 09:01 masimplo

I checked this and don't know how (or whether it is even possible) to implement it via the remote protocol. With lightweight edges, the information is stored in the vertices on both sides. With heavy edges, the information is stored in both vertices and in the edge itself. REQUEST_TX_COMMIT has an UPDATE operation, but it needs the whole chain of links, because the command replaces the property with a new value.

GoorMoon avatar Jan 25 '15 17:01 GoorMoon

Maybe @lvca can shed some light on this, since the Java client supports this functionality.

e.g.

try{
  Vertex luca = graph.addVertex(null); // 1st OPERATION: IMPLICITLY BEGIN A TRANSACTION
  luca.setProperty( "name", "Luca" );
  Vertex marko = graph.addVertex(null);
  marko.setProperty( "name", "Marko" );
  Edge lucaKnowsMarko = graph.addEdge(null, luca, marko, "knows");
  graph.commit();
} catch( Exception e ) {
  graph.rollback();
}

taken from: https://github.com/orientechnologies/orientdb/wiki/Graph-Database-Tinkerpop

masimplo avatar Jan 27 '15 09:01 masimplo

Somebody called me? :-) In Java we store temporary RIDs for new elements, and the server side already manages them correctly. So if the client can generate RIDs with a negative clusterPosition (like #10:-2, #10:-3, etc.) and put the RIDs in the right place (vertices/edges), then the OrientDB server will do the rest.

lvca avatar Jan 27 '15 11:01 lvca

And after the TX is committed, the new RIDs are sent back so that the client can update its local instances.
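
A minimal sketch of the client-side bookkeeping this implies is shown below. It is not how the Java client or this driver implements it: the class `TempRids` and its methods are hypothetical, RIDs are kept as plain strings, and starting the counter at -2 merely mirrors the #10:-2, #10:-3 example above.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client-side bookkeeping for temporary record IDs.
final class TempRids {
    // Next negative position to hand out, tracked per cluster.
    private final Map<Integer, Long> nextPosition = new HashMap<>();
    // Filled after commit: temporary RID -> RID assigned by the server.
    private final Map<String, String> resolved = new LinkedHashMap<>();

    // Returns a RID such as #10:-2, then #10:-3, for the given cluster.
    String allocate(int clusterId) {
        long position = nextPosition.getOrDefault(clusterId, -2L);
        nextPosition.put(clusterId, position - 1);
        return "#" + clusterId + ":" + position;
    }

    // A RID with a negative cluster position has not been committed yet.
    static boolean isTemporary(String rid) {
        return rid.contains(":-");
    }

    // Records one (temporary, real) pair as reported back after the commit.
    void resolve(String temporaryRid, String realRid) {
        resolved.put(temporaryRid, realRid);
    }

    // Maps a RID held by a local object to its committed value, if known.
    String current(String rid) {
        return resolved.getOrDefault(rid, rid);
    }

    public static void main(String[] args) {
        TempRids rids = new TempRids();
        String person = rids.allocate(10);
        rids.resolve(person, "#10:42"); // "#10:42" is a made-up server answer
        System.out.println(person + " -> " + rids.current(person));
    }
}
```

After a commit, local vertex and edge objects would replace each temporary RID they hold with `current(rid)`.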

lvca avatar Jan 27 '15 11:01 lvca

Thanks for popping up so fast, @lvca. The negative clusterPosition logic is already present in this client. So we can create two nodes with IDs #cluster1:-1 and #cluster2:-1, add #cluster2:-1 to out_hasConnection of node1 and #cluster1:-1 to in_hasConnection of node2, and everything works fine. The problem arises when node1 (say #cluster1:0) already exists and could have thousands of out_hasConnection edges. In order to add node2 and connect it, we would have to load node1 with a list of thousands of ORIDs in its out_hasConnection field and add #cluster2:-1 to that list. As time goes by and out_hasConnection grows, performance will degrade.

The most efficient way would be to create node2 in memory exactly as we do now and then execute something like "create edge from #cluster1:0 to #cluster2:-1" inside the same transaction if at all possible, thus not loading #cluster1:0 at all.

masimplo avatar Jan 27 '15 12:01 masimplo

Got it: the .NET driver has no Bonsai implementation, so edges arrive as one big collection. Mmhm, I don't know if issuing a command in the middle of a TX works with negative RIDs...

lvca avatar Jan 27 '15 12:01 lvca

@lvca, how would a Bonsai tree help in this case? We need to create an edge between two existing nodes, one of which already has thousands of outgoing edges, inside a transaction via the remote protocol. Could you please give an example?

GoorMoon avatar Jan 27 '15 16:01 GoorMoon

I think the best guys are @tglman and @laa on this. This is the Java class:

https://github.com/orientechnologies/orientdb/blob/5dbce62a0e9e0f96258192d3afafb13df0c0ce41/core/src/main/java/com/orientechnologies/orient/core/index/sbtreebonsai/local/OSBTreeBonsai.java

lvca avatar Jan 27 '15 16:01 lvca

@tglman, @laa,

can you help ?

GoorMoon avatar Jan 27 '15 18:01 GoorMoon

hi @GoorMoon,

OSBTreeBonsai helps with big collections because you don't download the whole collection; you fetch parts of it as you need them. When it is used, only the root of a tree is embedded in the document, and from that root you can load all the other parts using these operations: http://www.orientechnologies.com/docs/last/orientdb.wiki/Network-Binary-Protocol.html#request_create_sbtree_bonsai

Also, when you update that tree, you keep the changes locally and send just the changes.

some code details that may help : https://github.com/orientechnologies/orientdb/blob/5dbce62a0e9e0f96258192d3afafb13df0c0ce41/core/src/main/java/com/orientechnologies/orient/core/db/record/ridbag/sbtree/OSBTreeRidBag.java#L746

https://github.com/orientechnologies/orientdb/blob/5dbce62a0e9e0f96258192d3afafb13df0c0ce41/core/src/main/java/com/orientechnologies/orient/core/db/record/ridbag/sbtree/OSBTreeRidBag.java#L862
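
To illustrate the "keep the changes locally and send just the changes" idea in isolation, here is a rough sketch. It is not the OSBTreeRidBag implementation and not the wire format: the class `RidBagDelta` and its methods are hypothetical, and RIDs are kept as plain strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of change tracking for a large link collection.
// The stored links are never loaded; only the pending changes live here.
final class RidBagDelta {
    // RID -> net change in occurrence count (+1 per add, -1 per remove).
    private final Map<String, Integer> changes = new LinkedHashMap<>();

    void add(String rid) {
        bump(rid, 1);
    }

    void remove(String rid) {
        bump(rid, -1);
    }

    private void bump(String rid, int delta) {
        int updated = changes.getOrDefault(rid, 0) + delta;
        if (updated == 0) {
            changes.remove(rid); // an add and a remove cancel out
        } else {
            changes.put(rid, updated);
        }
    }

    // What would need to be sent: the pending changes, not the stored links.
    Map<String, Integer> pendingChanges() {
        return new LinkedHashMap<>(changes);
    }

    public static void main(String[] args) {
        RidBagDelta outFriends = new RidBagDelta();
        outFriends.add("#13:5"); // link one new friend
        System.out.println(outFriends.pendingChanges());
    }
}
```

With this shape, adding one edge to a vertex that already has thousands of links produces a single-entry change set, regardless of how large the stored collection is.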

tglman avatar Jan 28 '15 12:01 tglman

@tglman, thanks for answering. I still don't understand how this will help in our case. I understand that if a vertex has a huge collection of links (more than the threshold, default value 80), then querying or loading that vertex returns an SBTree with a pointer to the root of a tree.

How can I link that vertex to another one inside a transaction without loading it first at all?

For Example:

I have a vertex Person with a property out_Friends. How can I add another friend to Person in a transaction via the remote protocol without loading it first (I know the @rid of this vertex)?

I'd appreciate it if you could give examples, or even a test case (Java is fine too), that can shed light on this. I need an example of how to do such a thing through the remote protocol.

I apologise for the noise, but this is important. Thanks.

GoorMoon avatar Jan 28 '15 12:01 GoorMoon

@tglman ?

GoorMoon avatar Feb 03 '15 10:02 GoorMoon

Hi @GoorMoon

Sorry for the late reply. The idea behind the SBTree is that if you want to link one record to another, you first load the first record, get the SBTree root from it, and then add a change operation that adds the ID of the second record. So you only need to load the first record, not both. Is this clear?

tglman avatar Feb 04 '15 17:02 tglman

@tglman Can we talk privately on chat?

GoorMoon avatar Feb 04 '15 17:02 GoorMoon

yes sure: https://gitter.im/tglman

tglman avatar Feb 04 '15 17:02 tglman

Any update regarding this?

nevers avatar Apr 29 '16 06:04 nevers