cypher-for-gremlin
cypher-for-gremlin copied to clipboard
Question: translation result cannot run on Janus Graph
First thanks for your contributions for this useful tool.
I translated cypher with parameters to gremlin script using this tool.
Cypher: MATCH (arch:Architecture{_id: {archId}}) RETURN arch AS arch
Translated gremlin script: g.V().as('arch').hasLabel('Architecture').where(__.choose(__.constant(archId), __.constant(archId), __.constant(' cypher.null')).is(neq(' cypher.null')).as(' GENERATED1').select('arch').values('_id').where(eq(' GENERATED1'))).select('arch').project('arch').by(__.choose(neq(' cypher.null'), __.valueMap(true)))
This gremlin script can run successfully on Neo4j. But after I switched db to JanusGraph, I cannot get result and got this error.
I commit the gremlin script using following code in JavaScript.
client.submit(gremlinScript, params)
And I also created index for _id
property.
Could anyone help with this issue? And what is the best practice for this kind of requirements?
Hello @yuzhebj,
this behavior is related to parameter handling.
To avoid NullPointerException
in case parameter has null
value, parameter is wrapped into null guard .choose(constant(archId), constant(archId), constant(' cypher.null'))
. Thats why Gremlin can not fold .has('_id', ...)
in .V()
step to benefit from index and query is slow.
I agree that this translation is not optimal (although correct) and parameter validation should be configurable.
Until this is implemented, I can only suggest trying a workaround with manual inlining of parameters e.g: MATCH (arch:Architecture{_id: 2}) RETURN arch AS arch
.
Hello @dwitry , thanks very much for your reply.
I remove the parameters, and it works. Also the gremlin script is much shorter than before. 👍
But I still wonder whether there is some static variable or parameter can replace the 'cypher.null' in gremlin?
In addition, in my translation result, I have script like this by(__.constant('_id')).select(values).map(cypherContainerIndex())
and get error
I cannot find any document about it. How can I rewrite this gremlin script? (I have refined my cypher script in several different ways, but still have this cypherContainerIndex() function in gremlin result)
But I still wonder whether there is some static variable or parameter can replace the 'cypher.null' in gremlin?
Not at the moment. If you will use Java null
, the traversal will fail with NPE. This might be improved in TinkerPop 3.5.
(I have refined my cypher script in several different ways, but still have this cypherContainerIndex() function in gremlin result
cypherContainerIndex()
is custom function that allows accessing elements by index when type is unknown (refer to docs for more detailed explanation).
It is included in Extensions and Server Plugin.
Have you installed Server Plugin on your target Gremlin Server?
@yuzhebj can you show Cypher query that translates with cypherContainerIndex()
and produces this error?
@dwitry I do have installed server plugin on my gremlin server for this extension.
I use java project to do the translation task offline (I have added CustomPredicate.java and CustomFunctions.java to my project, version of gremlin translation library is 0.9.13
, because the latest version 1.0.2 has some feature not supported in JanusGraph).
The version of gremlin server is 3.4.1
. I have installed the extension by bin/gremlin-server.sh install org.opencypher.gremlin cypher-gremlin-server-plugin
And I execute gremlin script in a nodejs project.
This is the cypher:
MATCH (instance:TEST {_id: 'instanceId_PARAM'}) OPTIONAL MATCH (instance)-[rel *]->(node) WITH *, CASE WHEN node IS NOT NULL THEN {start: startNode(LAST(rel)), end: endNode(LAST(rel)), type: type(LAST(rel))} ELSE NULL END AS relTmp RETURN COLLECT(DISTINCT {start: relTmp.start._id, end: relTmp.end._id, type: relTmp.type}) AS rels
And this is the translation result:
g.V().as('instance').hasLabel('TEST').has('_id', eq('instanceId_PARAM')).choose(__.select('instance').as(' cypher.path.start.GENERATED3').repeat(__.outE().as('rel').inV()).emit().until(__.path().from(' cypher.path.start.GENERATED3').count(local).is(gte(21))).as('node').optional(__.select(all, 'rel').as('rel')), __.select('instance').as(' cypher.path.start.GENERATED3').repeat(__.outE().as('rel').inV()).emit().until(__.path().from(' cypher.path.start.GENERATED3').count(local).is(gte(21))).as('node').optional(__.select(all, 'rel').as('rel')), __.constant(' cypher.null').as('rel').as('node')).where(__.choose(__.select('node').choose(__.is(neq(' cypher.null')), __.constant(true), __.constant(false)).is(eq(true)), __.project('start', 'end', 'type').by(__.select('rel').choose(__.tail(local, 1), __.tail(local, 1), __.constant(' cypher.null')).choose(neq(' cypher.null'), __.outV())).by(__.select('rel').choose(__.tail(local, 1), __.tail(local, 1), __.constant(' cypher.null')).choose(neq(' cypher.null'), __.inV())).by(__.select('rel').choose(__.tail(local, 1), __.tail(local, 1), __.constant(' cypher.null')).choose(neq(' cypher.null'), __.label().is(neq('vertex')))), __.constant(' cypher.null')).is(neq(' cypher.null'))).select('instance', 'node', 'rel').project('instance', 'node', 'rel', 'relTmp').by(__.select('instance')).by(__.select('node')).by(__.select('rel')).by(__.choose(__.select('node').choose(__.is(neq(' cypher.null')), __.constant(true), __.constant(false)).is(eq(true)), __.project('start', 'end', 'type').by(__.select('rel').choose(__.tail(local, 1), __.tail(local, 1), __.constant(' cypher.null')).choose(neq(' cypher.null'), __.outV())).by(__.select('rel').choose(__.tail(local, 1), __.tail(local, 1), __.constant(' cypher.null')).choose(neq(' cypher.null'), __.inV())).by(__.select('rel').choose(__.tail(local, 1), __.tail(local, 1), __.constant(' cypher.null')).choose(neq(' cypher.null'), __.label().is(neq('vertex')))), __.constant(' cypher.null'))).as(' GENERATED4').select('instance').as('instance').select(' GENERATED4').select('node').as('node').select(' GENERATED4').select('rel').as('rel').select(' GENERATED4').select('relTmp').as('relTmp').project('start', 'end', 'type').by(__.project(' GENERATED5', ' GENERATED6').by(__.project(' GENERATED7', ' GENERATED8').by(__.identity()).by(__.constant('start')).select(values).map(cypherContainerIndex())).by(__.constant('_id')).select(values).map(cypherContainerIndex())).by(__.project(' GENERATED9', ' GENERATED10').by(__.project(' GENERATED11', ' GENERATED12').by(__.identity()).by(__.constant('end')).select(values).map(cypherContainerIndex())).by(__.constant('_id')).select(values).map(cypherContainerIndex())).by(__.project(' GENERATED13', ' GENERATED14').by(__.identity()).by(__.constant('type')).select(values).map(cypherContainerIndex())).dedup().is(neq(' cypher.null')).fold().project('rels').by(__.identity())
version of gremlin translation library is 0.9.13, because the latest version 1.0.2 has some feature not supported in JanusGraph)
Just to note that JanusGraph 0.4.0
, is compatible with 1.0.0
and even 1.0.2
if you don't need Spark and OLAP capabilities.
And this is the translation result:
So cypherContainerIndex()
index comes from relTmp.start._id
: frontend does not know the type of relTmp
so extraction of start
is done in runtime via custom function. Same goes for extracting _id
from start
.
Considering you've installed the plugin, error message about missing signature of cypherContainerIndex()
looks very strange. Does any other custom functions like toInteger
or toString
work?
If not, may I suggest you to try to import CustomFunctions
explicitly in Gremlin Server configuration:
scriptEngines: {
gremlin-groovy: {
plugins: { ... },
staticImports: ['org.opencypher.gremlin.traversal.CustomFunctions.*']
...
@dwitry Thanks for your reply.
I updated Gremlin Server configuration like what you suggested and restart the gremlin server. But I still have this error... I also tried to install the server plugin in the gremlin server in JanusGraph 0.4.0. And also failed...
I also followed the manual steps and I can see [INFO] OpLoader - Adding the cypher OpProcessor.
after started gremlin-server... Is there any other suggestion?
@yuzhebj can you clarify if any other custom functions like toInteger
or toString
work?
@dwitry I tried toString
function, and it does NOT work via both gremlin console and nodejs client.
The cypher is MATCH (arch:Architecture {_id:'archId'}) RETURN toString(arch.number)
And the gremlin script is g.V().hasLabel('Architecture').has('_id', eq('archId')).project('toString(arch.number)').by(__.choose(neq(' cypher.null'), __.choose(__.values('number'), __.values('number'), __.constant(' cypher.null'))).map(cypherToString()))
I also got this error
I followed this guide to install the gremlin extension (both the automated and manual installation parts)
Ok, still sounds very strange to me, need more details to find cause. Only thing I can suggest is that you post your Gremlin Server configuration yaml here, and I'll try to reproduce this on Monday.
@dwitry this is my gremlin server config yaml. And thanks very much! gremlin-server-yaml.zip
@yuzhebj could you try to add org.opencypher.gremlin.server.jsr223.CypherPlugin: {}
to scriptEngines.gremlin-groovy.plugins
in your yaml
file?
@dwitry I think it works. I will try more gremlin scripts. Thanks so much! 👍👍
@dwitry regarding this comment is it possible to disable parameter validation in current v1.0.4
release? Could the Translator.ParametrizedFlavorBuilder.inlineParameters()
be useful here?
Also, regarding the inlining parameters workaround, this make possible SQL injection vulnerability