Implementing new features

Open kefirbandi opened this issue 11 years ago • 10 comments

I am working on a project, for which I found bulbs to be very useful. There are however a few features, which I miss, such as

1, checking the equivalence of two elements:

if myvertex1.equivalent(myvertex2): ....

2, being able to execute a previously defined custom gremlin step from my vertex (or edge), such as:

myfriends = myvertex1.custom('friends')

and many others ...

I have implemented the above two features in my fork, and I would like to get them included in the master. However, before sending the pull-request I would like to know your opinion and get some coding guidelines to make sure the code fits the philosophy.

Since I plan to use bulbs in the project I am planning to contribute more in the future for the benefit of everyone, including myself.

Jan 21 '13 12:01 kefirbandi

You can compare two elements with ==

Notice the equivalence operator is overloaded in element, along with a few others:

https://github.com/espeed/bulbs/blob/master/bulbs/element.py#L232

Regarding #2, Bulbs isn't a Gremlin implementation, so all Gremlin is executed via Gremlin-Groovy scripts, which you can reuse if you store them in your scripts library via the scripts.update() method.

See http://bulbflow.com/docs/api/bulbs/gremlin/

James

Bulbflow: A Python framework for graph databases (http://bulbflow.com)

Jan 24 '13 03:01 espeed

Hi James,

1:

The == operator (which I didn't find at first), as currently implemented, is a bit different from what I need.

Currently it is:

(element.__class__ == self.__class__ and
 element._id == self._id and
 element._data == self._data)

and what I need is:

(element.__class__ == self.__class__ and
 element._data == self._data)

I think it is a handy feature and if it is implemented I believe the right place for it is in the element class.
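To make the difference concrete, here is a minimal sketch of the two checks side by side. The Element class below is a toy stand-in written for illustration, not the bulbs Element class, and the equivalent() method name is the one proposed in this issue rather than an existing bulbs API.

```python
class Element:
    """Toy stand-in for a graph element (assumption: not the bulbs class)."""

    def __init__(self, _id, data):
        self._id = _id
        self._data = dict(data)

    def __eq__(self, other):
        # Strict check: same class, same id, same data.
        return (other.__class__ == self.__class__ and
                other._id == self._id and
                other._data == self._data)

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        # Defining __eq__ disables the default hash in Python 3, so restore one.
        return hash((self.__class__, self._id))

    def equivalent(self, other):
        # Looser check proposed here: same class and same data, ids may differ.
        return (other.__class__ == self.__class__ and
                other._data == self._data)


a = Element(1, {"name": "Ann"})
b = Element(2, {"name": "Ann"})
strictly_equal = (a == b)        # ids differ, so the strict check fails
data_equal = a.equivalent(b)     # data matches, so the loose check passes
```

Keeping == strict and adding a separate method avoids changing the behaviour of existing code that relies on id comparison.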

2:

What I'm looking for here is really an extension of the myvertex1.outV kind of semantics. So (assuming everything is set up), it is more intuitive to write

v1.custom('friend')

(or maybe even v1.friend) than

g.gremlin.query(friend_step_script, dict(id=v1.eid))

Both of these features could be implemented at higher levels e.g. in classes derived from the Node and Relationship classes. But since both features are common to both classes I think their place is in the common base class.
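A rough sketch of how a custom() method could live in a shared base class follows. Everything here is hypothetical: FakeGremlin stands in for g.gremlin and only records the query, CustomStepMixin and custom_steps are names invented for the example, and the Gremlin-Groovy strings are illustrative.

```python
class FakeGremlin:
    """Hypothetical stand-in for g.gremlin: records queries instead of running them."""

    def __init__(self):
        self.calls = []

    def query(self, script, params=None):
        self.calls.append((script, params))
        return []  # a real client would return the matching elements


class CustomStepMixin:
    """Shared by vertices and edges so custom() is defined only once."""

    custom_steps = {}  # step name -> Gremlin-Groovy script (illustrative)

    def custom(self, name, **params):
        script = self.custom_steps[name]
        # Always bind the element's own id, as in dict(id=v1.eid) above.
        bound = dict(params, id=self.eid)
        return self.gremlin.query(script, bound)


class Vertex(CustomStepMixin):
    custom_steps = {"friend": "g.v(id).out('knows')"}

    def __init__(self, eid, gremlin):
        self.eid = eid
        self.gremlin = gremlin


class Edge(CustomStepMixin):
    custom_steps = {"ends": "g.e(id).bothV"}

    def __init__(self, eid, gremlin):
        self.eid = eid
        self.gremlin = gremlin


gremlin = FakeGremlin()
v1 = Vertex(7, gremlin)
v1.custom("friend")
```

The call site stays short (v1.custom('friend')) while the script lookup and id binding happen in one place for both element types.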

Jan 24 '13 09:01 kefirbandi

I like the idea of making it easy to call custom Gremlin scripts via Python methods.

The guys at SHIFT just created a Titan-only client for Python that has this feature. See https://github.com/StartTheShift/thunderdome/wiki/Gremlin for inspiration.

https://github.com/StartTheShift/thunderdome/blob/master/thunderdome/gremlin.py

Feb 05 '13 11:02 espeed

Would it create any copyright issues if parts of that code were used in bulbs? It seems that the thunderdome.gremlin module could be used without modification, and the ModelMeta class in bulbs would need to be modified to handle thunderdome.BaseGremlinMethod type of attributes. Should the thunderdome.gremlin module be copied and pasted into bulbs, or rather imported from thunderdome, effectively making thunderdome a prerequisite for bulbs?

Mar 08 '13 15:03 kefirbandi

About calling custom Gremlin scripts: There are two ways to do it, and both make sense:

1, mapping Python methods to Gremlin scripts, like in case of thunderdome

2, having a .custom(self,string,dictionary={}) method in Nodes and Relationships

The first is more elegant, but the second is very handy for on-the-fly custom steps, like node.custom("outE('a').inV('b')").

No need to look up definition, when viewing the code
Less overhead when writing the code
No need to modify original class definition to use this kind of semantics.

So I believe there is room for both of these.
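The two styles can coexist on the same class, as the sketch below tries to show. It is an assumption-laden toy: GremlinMethod is a hypothetical descriptor (not thunderdome's BaseGremlinMethod), Node is not the bulbs Node, and the run callable stands in for whatever actually sends the script to the server.

```python
class GremlinMethod:
    """Hypothetical descriptor: a class attribute that calls a predefined script."""

    def __init__(self, script):
        self.script = script

    def __get__(self, instance, owner):
        if instance is None:
            return self

        def call(**params):
            # Bind the element's id together with any extra parameters.
            return instance._run(self.script, dict(params, id=instance.eid))

        return call


class Node:
    """Toy node (assumption: not the bulbs Node class)."""

    def __init__(self, eid, run):
        self.eid = eid
        self._run = run  # callable(script, params) that executes the script

    def custom(self, steps, params=None):
        # On-the-fly style: append ad hoc steps to this vertex.
        script = "g.v(id)." + steps
        return self._run(script, dict(params or {}, id=self.eid))


class Person(Node):
    # Mapped style: defined once, next to the model.
    friends = GremlinMethod("g.v(id).out('knows')")


log = []


def fake_run(script, params):
    log.append((script, params))
    return []


p = Person(3, fake_run)
p.friends()                       # style 1: mapped method
p.custom("outE('a').inV('b')")    # style 2: ad hoc steps
```

The sketch uses params=None instead of the dictionary={} default mentioned above, since a mutable default argument is shared between calls in Python.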

Mar 08 '13 15:03 kefirbandi

Hi Andras -

It looks like Thunderdome is MIT so that's pretty flexible. The authors started off using Bulbs-Neo4j and then moved to Bulbs-Titan (which isn't fully baked yet) before writing Thunderdome (which is Titan specific). You can see ideas from Bulbs in Thunderdome (e.g. the sourcing of Gremlin scripts was from Bulbs -- Thunderdome made it model specific) so I don't think they'd mind if we borrowed some of their good ideas. The Thunderdome and Bulbs design philosophies differ in some ways, so approaches to feature implementations will probably differ also.

For example, Bulbs supports multiple backends and can connect to multiple databases by creating multiple Graph objects -- the Thunderdome team made the design choice to only support Titan and one DB at a time via a config file, while Bulbs' config is set via an object. In fact, the Thunderdome 0.3 design is how the original Bulbs 0.2 was designed years ago when Bulbs only supported Rexster, before Neo4j Server had support for Gremlin, and well before Titan was on the scene.

Bulbs 0.3 added support for Neo4j Server and did away with the Django-esque settings.py file in favor of a Config object, which made for a much cleaner design and the ability to support and connect to multiple DBs by using multiple graph objects. And using a graph object is more in line with how Blueprints/Gremlin is designed, so it helps keep things consistent.

And once you craft a custom Graph object in Bulbs, that's the only thing you ever have to import because all your models and everything you need is self contained within it. This makes it simple in the REPL and in your app because instead of having to import all the models separately, and having them scattered across multiple files, you just import the Graph object which contains everything.

Bulbs went from 0.2 to 0.3 when I added support for Neo4j Server, but I only bumped it up to 3.9 when I added support for Titan because Titan is changing so fast it's hard to tell right now what will be the best approach. Right now, Bulbs-Titan is based on Bulbs-Rexster, with a few changes to indexing.

Note that indexing is something that differs across Rexster, Neo4j Server, and Titan so rather than try to come up with a general way to do indexing for each, the Bulbs adapter is designed so you can define custom indexes depending on the back end.

Another design choice that differs between Bulbs and Thunderdome is that Thunderdome opted for a Gremlin-only interface rather than supporting the full Rexster-Titan REST client. This simplifies the code, and Bulbs makes heavy use of Gremlin so I can understand their decision to go this route. But right now there are some performance issues with the Gremlin plugin that need to be worked out before I would drop the REST API completely and go full Gremlin.

For example, the Gremlin plugin has to restart every 500 requests, otherwise you'll get PermGen errors, and the plugin is still using the JSR 223 script engine rather than the native GroovyScriptEngine. We have known about these issues for a while, and I have talked to blackdrag, the Groovy project lead, about this. He implemented a fix a few weeks back, but it hasn't made it into the plugin yet -- it would be best if we swapped out the JSR 223 script engine for the native GroovyScriptEngine, but that's a big change and is still on the table.

With that said, I like the auto-Python wrapper in your first approach, but Thunderdome's MetaClass stores data differently than Bulbs related to the reasons stated above, so I'm not sure a simple drop-in will work, although I haven't looked at it closely.

In addition, Neo4j 2.0 is about to come out, which will include the new, long-awaited indexing framework. This is important because Neo4j no longer offers Gremlin on the Neo4j Heroku Add On, even though Cypher doesn't support indexes and Gremlin is the only way to do transactions with indexed content, so yanking Gremlin from Heroku effectively broke Bulbs for Neo4j-Heroku.

And there has been no way to fix this in Bulbs because Cypher simply doesn't support indexing -- the Cypher authors think indexing is out of scope for Cypher and it should be handled at a lower level so we've been waiting in limbo for the indexing framework to come out. Now that it's finally coming out, hopefully we can make Bulbs work with Neo4j-Heroku again.

Also, next week RexPro is being released, which is a binary protocol for Rexster and may eventually obsolete Rexster REST. This means we will need to convert the Rexster adapter to use RexPro.

All this to say, I've been watching and anticipating these changes for months so I've held back making any drastic design changes to Bulbs until these things happen. By some good fortune, they are all about to hit at the same time so we will have the opportunity to factor all of them into the design of Bulbs's next major release (Bulbs 0.4).

RexPro should simplify the Rexster and Titan adapters, in addition to providing a massive performance boost. The new Neo4j indexing platform is a major overhaul and a move to auto-indexing, so it will necessitate a Neo4j-indexing overhaul in Bulbs, which may have implications for how Bulbs Models handle auto-indexing.

All that to say, there are some major, long-awaited changes coming for all three backends so before we invest too much time in the design of top-level features, it may make sense to implement the low-level adapter changes to see how stuff like Neo4j autoindexing affects the design of indexing in the models, and how RexPro affects our approach to Gremlin.

Mar 09 '13 01:03 espeed

Hi James,

Feel free to repurpose any thunderdome code that is useful to you!

Blake & Jon & Eric

Mar 13 '13 00:03 bdeggleston

Thanks Blake -

The way you guys did model-specific Gremlin methods is really slick. And the work you and Stephen did on RexPro is going to take Rexster to a whole new level -- exciting times in TinkerPop land :)

James

Mar 13 '13 01:03 espeed

James,

I have to admit that my neo4j or titan specific experience is close to nothing. This includes indexing as well. I always go through python->bulbs->rexster, occasionally using the rexster console and writing simple gremlinGroovy scripts. So I would stay on the Python end of things and try to give general solutions to the problems I face as a user, including the ones mentioned in the previous comments. As an example, I have now implemented a generic solution to handle a ManyToOne or OneToOne relationship as a class variable. Example:

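from bulbs.model import Node, Relationship
from bulbs.property import Float

# XtoOne is the new class-variable helper described above; it is not part of bulbs itself.

class NodeA(Node):
    element_type = 'nodea'
    something = Float()

class Father(Relationship):
    label = 'father'

class NodeB(Node):
    element_type = 'nodeb'
    value = Float()
    father = XtoOne('father')  # at most one outgoing 'father' link

# ... the usual set up steps here ...

B = g.NodeB.create()
A = g.NodeA.create()
B.father = A  # sets up a link labelled 'father' from B to A
assert B.father == A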

For these things I try not to rely on the implementation of lower levels, but we will see how well they survive the restructuring.
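One way such a class variable could work is as a Python descriptor. The sketch below is only a guess at the idea, using an in-memory dictionary in place of a real graph: the Node class here is a toy (not the bulbs Node), and this XtoOne is an illustrative reimplementation, not the code from the fork.

```python
class XtoOne:
    """Illustrative descriptor: at most one outgoing link per label."""

    def __init__(self, label):
        self.label = label

    def __get__(self, node, owner):
        if node is None:
            return self
        # Returns None when no link with this label has been set.
        return node.out_links.get(self.label)

    def __set__(self, node, target):
        # Overwriting the entry is what keeps the relationship "to one".
        node.out_links[self.label] = target


class Node:
    """Toy in-memory node (assumption: stands in for a bulbs Node)."""

    def __init__(self):
        self.out_links = {}  # label -> target node


class NodeA(Node):
    pass


class NodeB(Node):
    father = XtoOne("father")


A = NodeA()
B = NodeB()
B.father = A          # link labelled 'father' from B to A
first_father = B.father

C = NodeA()
B.father = C          # replaces the previous link instead of adding a second one
```

In a real implementation, __set__ would presumably create the edge in the database and remove any previous one, and __get__ would follow the edge to fetch the target vertex.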

Mar 14 '13 14:03 kefirbandi

One thing I plan on implementing in the 0.4 Relationship model is a way to provide a list of acceptable in-bound, out-bound, or bidirectional Nodes so that the Relationship model can verify the relationship is allowed.
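As a purely speculative illustration of that kind of check (the 0.4 design is not described here, so every name below is an assumption), a relationship class could declare which node classes may sit at each end and validate before creating the edge. The bidirectional case is left out of this sketch.

```python
class Node:
    """Toy node base class (assumption: not the bulbs Node)."""


class Person(Node):
    pass


class Company(Node):
    pass


class Relationship:
    label = None
    allowed_out = (Node,)  # hypothetical: classes allowed as the edge's source
    allowed_in = (Node,)   # hypothetical: classes allowed as the edge's target

    @classmethod
    def validate(cls, out_node, in_node):
        if not isinstance(out_node, cls.allowed_out):
            raise TypeError("%s cannot start at %s"
                            % (cls.label, type(out_node).__name__))
        if not isinstance(in_node, cls.allowed_in):
            raise TypeError("%s cannot end at %s"
                            % (cls.label, type(in_node).__name__))
        return True


class WorksAt(Relationship):
    label = "works_at"
    allowed_out = (Person,)
    allowed_in = (Company,)


ok = WorksAt.validate(Person(), Company())
```

Calling WorksAt.validate(Company(), Person()) would raise TypeError, since the source node is not a Person.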

May 22 '13 10:05 espeed
