orientdb-labs icon indicating copy to clipboard operation
orientdb-labs copied to clipboard

[OEP 2] Unified Multi-Model (Document/Graph) API and class hierarchy

Open luigidellaquila opened this issue 8 years ago • 25 comments

Reference:

https://github.com/orientechnologies/orientdb-labs/blob/master/OEP_2.md

Summary:

Review the class hierarchy of both Document and Graph APIs to create a unified hierarchy and a unified Multi-Model API.

Goals:

  • Finally have a single, Multi-Model API
  • Better interoperability between Document and Graph API
  • Have a single way to use all the OrientDB functionalities with a single API
  • Make it easier to choose the API (no choice at all)
  • Make it easier to write docs for the API (no more duplicated docs for document and graph)

Non-Goals:

  • deprecate or remove any of the existing data models (Document vs. Graph)

Motivation:

In OrientDB 2.2, Document and Graph APIs are distinct and very different (see doc.field() and vertex.getProperty()). This creates the following problems:

  • New users have a hard time choosing the API to start with
  • Query results sometimes return a mix of the two models, so it's hard to write consistent code
  • We have to write an maintain duplicate docs, one of each API
  • We lack the core requirement of a Multi-Model database: a unified Multi-Model API

Description:

Now we have two separate class hierarchies, one for ODocument and one for OrientElement, where the only super-interface is OIdentifiable, but it obviously lacks all the methods for the domain manipulation.

The proposal is following:

  • Create a common interface T for both ODocument and OrientElement, with methods to
    • get/set properties
    • access and manipulate metadata (OClass)
  • Have two sub-interfaces of this interface, one for vertices (T1) and one for edges (T2), with specific methods for graph traversal (getEdges(), getVertices(), or maybe out()/in()?)
  • Create a new implementation of O_Vertex and O_Edge in the Core module and review their implementation to extend what currently is ODocument. Let OrientVertex and OrientEdge extend this new implementation.
  • Unify the implementation of all the hierarchy, to use the same methods for property and schema manipulation
  • Refactor ODatabase interface to
    • let all the methods return T or Generics<T>
    • implement methods to instantiate and retrieve both T and T1/T2 explicitly
  • See if it's possible to refactor OrientGraph to implement both ODatabase and TinkerPop Graph.

Alternatives:

  • only create the interface hierarchy but avoid to have graph classes extend ODocument class

Risks and assumptions:

  • collision with method signatures of other standards (eg. next TinkerPop versions)

Impact matrix

  • [ ] Storage engine
  • [ ] SQL
  • [ ] Protocols
  • [ ] Indexes
  • [ ] Console
  • [x] Java API
  • [ ] Geospatial
  • [ ] Lucene
  • [ ] Security
  • [x] Hooks
  • [ ] EE

luigidellaquila avatar Jun 20 '16 11:06 luigidellaquila

I agree with this enhancement. Unifying the different models' API will make orientdb more comprehensible and usable for users. Moreover IMHO it should increase the perception of working with a real multi-model database. I've never had completely this impression when I was a simple orientdb user.

G4br13l3 avatar Jun 20 '16 13:06 G4br13l3

I agree on this, i would just add some bits, we should have our graph api not dependent to the ThinkerPop API, this should be native and optimized for our Multi-Model cases, anyway the layer of mapping betwheen our API and the ThinkerPop should be as thin as possible, meaning that the behaviour of the api should be really similar.

In genera in term of api i agree on the specified hierarchy, going on the specific i would like to have this api: ODocument:

  • boolean isVertex()
  • OVertex asVertex()
  • boolean isEdge()
  • OEdge asEdge()

OVertex:

  • Set<OVertex> out(String label)
  • Set<OVertex> in(String label)
  • Set<OEdge> outE(String label)
  • Set<OEdge> inE(String label)
  • Set<OVertex> both(String label)
  • Set<OEdge> bothE(String label)
  • OEdge connect(String label, OVertex target)
  • boolean disconnect(OEdge edge) //to evaluate
  • boolean disconnect(String label, OVertex target) //to evaluate

OEdge:

  • OVertex out()
  • OVertex in()

tglman avatar Jun 20 '16 14:06 tglman

I don't like "connect/disconnect" very much as a name, but the concept is excellent

luigidellaquila avatar Jun 20 '16 14:06 luigidellaquila

probably we can stick to the ThinkerPop/Sql like "createEdge" "deleteEdge"

tglman avatar Jun 20 '16 15:06 tglman

Implementation proposal

New interfaces and classes

interface OElement extends OIdentifiable {
  public void setProperty(String name, Object val);
  public Object getProperty(String name);
  public void delete();
  public void save();
}
interface OVertex extends OElement {
  public Collection<OEdge> getEdges(Direction direction);
  public Collection<OEdge> getEdges(Direction direction, String... type);
  public Collection<OEdge> getEdges(Direction direction, OClass... type);

  public Collection<OVertex> getVertices(Direction direction);
  public Collection<OVertex> getVertices(Direction direction, String... type);
  public Collection<OVertex> getVertices(Direction direction, OClass... type);

  public OEdge addEdge(OVertex to);
  public OEdge addEdge(OVertex to, String type);
  public OEdge addEdge(OVertex to, OClass type);
}
interface OEdge extends OElement {
  public OVertex getFrom();
  public OVertex getTo();
}
class ODocument extends OElement {
   ...
}
class OVertexImpl extends ODocument implements OVertex {
   ...
}
class OEdgeImpl extends ODocument implements OEdge {
   ...
}

Open questions:

  • Should OElement extend OIdentifiable? What about result sets (see OEP_5)?
  • In case we decide to have OElement that does not extend OIdentifiable, should we still have save() and delete() methods? Do we need another interface in the middle?
  • Add asVertex() and asEdge() methods to OElement?

luigidellaquila avatar Jun 21 '16 16:06 luigidellaquila

from my point of view, no save, no detete, you should use directly the db. +1 asVertex, asEdge on the top. i prefer in + out version, but not a requirement ;)

tglman avatar Jun 21 '16 16:06 tglman

+1 for db.save() db.delete() We have to define the db interface as well, and the expected behaviors. Eg. how does a db.delete(vertex) affect the POJOs? Does it also disconnect the edge pointers of the connected edges?

luigidellaquila avatar Jun 21 '16 17:06 luigidellaquila

New proposal for OElement:

interface OElement extends OIdentifiable {
  public void setProperty(String name, Object val);
  public Object getProperty(String name);

  public Optional<ODocument> asDocument();
  public Optional<OVertex> asVertex();
  public Optional<OEdge> asEdge();

  public boolean isDocument();
  public boolean isVertex();
  public boolean isEdge();

  public Optional<OClass> getType();
}

luigidellaquila avatar Jun 21 '16 17:06 luigidellaquila

From last discussion with @tglman:

OElement and OIdentifiable should remain two different hierarchies, because result sets could return projections, that are OElement instances, but do not have an identity.

interface OElement {
  public void setProperty(String name, Object val);
  public Object getProperty(String name);

  public Optional<ODocument> asDocument();
  public Optional<OVertex> asVertex();
  public Optional<OEdge> asEdge();

  public boolean isDocument();
  public boolean isVertex();
  public boolean isEdge();

  public Optional<OClass> getType();
}

luigidellaquila avatar Jun 22 '16 07:06 luigidellaquila

Why isDocument()? All the OElement are implicitly documents. Or do you mean returning true when is nor edge neither vertex? How do you know if this is a projection?

lvca avatar Jun 22 '16 12:06 lvca

@lvca right, there is something missing here, the isDocument() is related to

https://github.com/orientechnologies/orientdb-labs/issues/5

The idea is that a result set will return OResult, that is an OElement but not and ODocument, so that you cannot save it directly (as you could see there is already a lot of discussion on the other thread about this ;-) )

Luigi

luigidellaquila avatar Jun 22 '16 13:06 luigidellaquila

I don't like the isDocument/Vertex/Edge. If I'm executing a query like select from V, I'm expecting to have vertices, so I'd prefer in the result set there are OVertex instances. Calling asVertex() every time I want the actual vertex is overkill. I liked more the first proposal, but I don't think OVertexImpl and OEdgeImpl should extend ODocument. They could be implemented in a totally different way, where the properties could be wrapped in a Document like structure, but the edges could be stored in RAM in different way?

lvca avatar Jun 22 '16 16:06 lvca

Hi @lvca

The only point of this OEP is to have a common hierarchy between Document and Vertex/Edge, use a common API and avoid encapsulation. I think this hierarchy should be in the Core, then single TinkerPop implementations can extend these classes or can encapsulate them (like current TP API does with ODocument).

The initial proposal was to have an OElement in the result set, so that, if you are sure it's an OVertex, you can just cast it and use it as is

luigidellaquila avatar Jun 22 '16 16:06 luigidellaquila

Actually there is no need for OVertex to extend ODocument, I just need it to implement OElement and OIdentifiable. Users should only use interfaces in next releases, so no ODocument anymore in user code ;-)

luigidellaquila avatar Jun 22 '16 16:06 luigidellaquila

I like the OElement and the fact we can break the graph api to be in the core (finally!), but the method asXXX are useless: if the OElement.isVertex() == true, then I can just cast it to OVertex, right?

lvca avatar Jun 22 '16 16:06 lvca

yep, this was the initial proposal and IMHO it's acceptable.

I think @tglman has something in mind about lazy fetching...

luigidellaquila avatar Jun 22 '16 16:06 luigidellaquila

i would say semantically this :

OVartex v = element.asVertex();
OVertex v = (OVertex) element;

are exactly the same thing.

if the element it actually come from a query the second one it can be no true. all the other load case as lazy loading on get i would say the two case are exactly the same. so linking myself to the other OEP: OResult shoud have asVertex() OElement(doc,vec,edge) itself is not a requirement, is more uniformity

tglman avatar Jun 22 '16 16:06 tglman

Agree 100% on unified interface. As I am trying to learn OrientDB I have found it confusing to have these two models (Doc vs Graph) and to try and figure out when do I use one vs the other and the pros/cons of each.

francisco1844 avatar Jun 26 '16 19:06 francisco1844

Additional proposal, always related to the APIs, but this time about the DB.

I'd like to add to ODatabase the following methods:

public OClass createClass(String className, String... superclasses);
public OClass createClassIfNotExists(String className, String... superclasses);
public OClass getClass(String className, String... superclasses);

public OClass createVertexClass(String className);
public OClass createEdgeClass(String className);

They just have to delegate to the schema, so they can be implemented as default methods in the interface itself.

The ratio behind it is to give users a shortcut for very common operations.

The first thing a user does in a DB is to create a class, now he has to do

OSchema schema = db.getMetadata().getSchema();
if(!schema.existsClass(className)){
   schema.createClass(className);
}

With the new API it would just be

db.createClassIfNotExists(className);

WDYT?

Thanks

Luigi

luigidellaquila avatar Jul 05 '16 16:07 luigidellaquila

So what about the index API?

lvca avatar Jul 05 '16 16:07 lvca

I was waiting for this question ;-) Indexes are mainly related to classes, so IMHO it's OK to get a Class from the DB and use it to manipulate its indexes. Everything is clear and simple, every component has its own responsibility:

  • classes belong to the DB
  • indexes belong to classes

Manual indexes will be considered an advanced feature, so it's OK for me to let them accessible only from the metadata objects, without cluttering the basic db interfaces.

WDYT?

Thanks

Luigi

luigidellaquila avatar Jul 06 '16 08:07 luigidellaquila

Hi,

Yes i had every time the same feeling about the creation/manipulation of the classes, and my problem has been that we will end up cluttering to much the database API, anyway i pro if we can find the right balance from what is in the shortcut and what in 'metadata'.

so for me ok to add on the database API shortcut for class manipulation and maybe user manpulation, and keep index, functions, scheduler, and advanced security(roles ecc) in metadata.

so i would add as well:

void dropClass(String className);

and maybe some api like:

OUser addUser(String username);
void dropUser(String username);

but that's it.

tglman avatar Jul 06 '16 11:07 tglman

Hi guys, Ok for me. And this is the same reason why I put basic class methods in OrientBaseGraph.

lvca avatar Jul 06 '16 13:07 lvca

Just pushed the first version of this multi-model API to develop branch

luigidellaquila avatar Nov 03 '16 16:11 luigidellaquila

Hi,

This has been fully implemented since 3.0.x so we can consider this closed.

Regards

tglman avatar May 24 '19 10:05 tglman