mongodb-gremlin icon indicating copy to clipboard operation
mongodb-gremlin copied to clipboard

A MongoDB compiler for the Gremlin traversal machine.

image::https://raw.githubusercontent.com/okram/mongodb-gremlin/73fb15f74d23a544d224e6a1f66e746bd3329e31/docs/images/mongodb-gremlin-logo.png[MongoDB-Gremlin]

MongoDB-Gremlin

MongoDB-Gremlin compiles MongoDB link:https://docs.mongodb.com/manual/tutorial/insert-documents/[insert] and link:https://docs.mongodb.com/manual/tutorial/query-documents/[query] documents to link:http://tinkerpop.apache.org/gremlin.html[Gremlin] bytecode. In doing so, JSON documents are embedded in the underlying link:http://tinkerpop.apache.org[Apache TinkerPop]-enabled graph system.

This project currently serves as a proof-of-concept and should not be used in any real, production capacity. It was created to provide a technical foundation for the link:https://www.datastax.com/dev/blog/the-von-gremlin-architecture[The Von Gremlin Architecture] article. If anyone is interested in taking over this project and flushing out the remaining CRUD operations of MongoDB (e.g. delete and update), please feel free to do so. There are a few peculiarities that were introduced for the sake of providing a quick POC. These should be rectified in a more thorough implementation and can be discussed with the interested developer.

NOTE: Please see link:https://github.com/twilmes/sql-gremlin[SQL-Gremlin] and link:https://github.com/dkuppitz/sparql-gremlin[SPARQL-Gremlin] for other examples of using common data structures and query languages with Gremlin.

JSON Graph Encoding


The JSON-to-graph encoding is as follows:

1. Every JSON object is a vertex.
2. Every JSON primitive field is a vertex property.
3. Every JSON array must either be all primitives or all objects.
** Primitive arrays are represented using vertex multi-properties.
** Object arrays are represented using edges to vertices.
4. JSON arrays may not contain JSON arrays.

It is important to note that appropriate graph-based indices should be defined to enable quick
retrieval based on the abstract structure defined above.

Example Usage
~~~~~~~~~~~~~

[source,groovy]
----
~/software/tinkerpop bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> :install com.datastax.tinkerpop mongodb-gremlin 0.1-SNAPSHOT
==>Loaded: [com.datastax.tinkerpop, mongodb-gremlin, 0.1-SNAPSHOT]
gremlin> import com.datastax.tinkerpop.mongodb.MongoDBTraversalSource
==>...
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> db = graph.traversal(MongoDBTraversalSource)
==>mongodbtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> db.insertOne("""
......1> {
......2>   "~id" : "71223bf3-9dcc-4de1-b95a-13fdb8aba9e0",
......3>   "~label" : "person",
......4>   "name" : "Gremlin",
......5>   "hobbies" : ["traversing", "reflecting"],
......6>   "birthyear" : 2009,
......7>   "alive" : true,
......8>   "languages" : [
......9>     {
.....10>       "name" : "Gremlin-Java",
.....11>       "language" : "Java8"
.....12>     },
.....13>     {
.....14>       "name" : "Gremlin-Python",
.....15>       "language" : "Python"
.....16>     },
.....17>     {
.....18>       "name" : "Ogre",
.....19>       "language" : "Clojure"
.....20>     }
.....21>   ]
.....22> }
.....23> """)
==>[acknowledged:true,insertId:71223bf3-9dcc-4de1-b95a-13fdb8aba9e0]
gremlin> db.find('{"name" : "Gremlin"}')
==>[~id:71223bf3-9dcc-4de1-b95a-13fdb8aba9e0,
    ~label:person,
    birthyear:2009,
    alive:true,
    hobbies:[traversing,reflecting],
    name:Gremlin,
    languages:
     [[~label:vertex,~id:2,name:Gremlin-Java,language:Java8],
      [~label:vertex,~id:6,name:Gremlin-Python,language:Python],
      [~label:vertex,~id:10,name:Ogre,language:Clojure]]]
----

Finally, it is important to note that because the JSON document is embedded in the graph as vertices, edges, and properties,
it is possible to use `GraphTraversal` to query the graph structure as well.

[source,groovy]
----
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:4 edges:3], standard]
gremlin> g.V().has('name','Gremlin').out('languages').has('name','Ogre').values('language')
==>Clojure
----