openCypher
Problem parsing oC_MultiPartQuery with ANTLR4
Hi to all,
I am using the ANTLR4 grammar in version M16 (https://s3.amazonaws.com/artifacts.opencypher.org/M16/Cypher.g4) and have an issue with oC_MultiPartQuery.
In the grammar, it says:
oC_MultiPartQuery
: ( ( oC_ReadingClause SP? )* ( oC_UpdatingClause SP? )* oC_With SP? )+ oC_SinglePartQuery ;
To my understanding, it is therefore possible to write something like this
MATCH (c:Label), (d:Label2) SET c.property = d.property WITH c MATCH (e:Label3 {name: c.name}) RETURN c
However, in the ANTLR4 parser (Java), I only get List<oC_ReadingClause>, List<oC_UpdatingClause> and List<oC_With>. Therefore, I cannot tell whether (e:Label3 {name: c.name}) belongs before or after any given entry in List<oC_With>.
Therefore, I propose changing the grammar to something like
oC_MultiPartQuery
: ( oC_MultiPartQueryGroup )+ oC_SinglePartQuery ;
oC_MultiPartQueryGroup
: ( oC_ReadingClause SP? )* ( oC_UpdatingClause SP? )* oC_With SP? ;
I am no expert in grammars so please excuse me if I missed something here.
All the best, Daniel
Hi Daniel,
thanks for your question and for using the openCypher artifacts.
I understand you are having some trouble consuming the parsing result from Antlr. I am not an Antlr expert by any means. According to my limited understanding, the parser gives you back a parse tree, which contains all the necessary information. The tree knows the order in which the grammar rules have parsed which parts of the input. This information clearly allows you to decide which MATCH clause a pattern belongs to and whether that MATCH clause appears before or after a WITH clause.
A sample output of the parse tree for your query looks like this:
oC_Cypher
  oC_Statement
    oC_Query
      oC_RegularQuery
        oC_SingleQuery
          oC_MultiPartQuery
            oC_ReadingClause
              oC_Match
                "MATCH"
                " "
                oC_Pattern
                  oC_PatternPart
                    oC_AnonymousPatternPart
                      oC_PatternElement
                        oC_NodePattern
                          "("
                          oC_Variable
                            oC_SymbolicName
                              "c"
                          oC_NodeLabels
                            oC_NodeLabel
                              ":"
                              oC_LabelName
                                oC_SchemaName
                                  oC_SymbolicName
                                    "Label"
                          ")"
                  ","
                  " "
                  oC_PatternPart
                    oC_AnonymousPatternPart
                      oC_PatternElement
                        oC_NodePattern
                          "("
                          oC_Variable
                            oC_SymbolicName
                              "d"
                          oC_NodeLabels
                            oC_NodeLabel
                              ":"
                              oC_LabelName
                                oC_SchemaName
                                  oC_SymbolicName
                                    "Label2"
                          ")"
            " "
            oC_UpdatingClause
              oC_Set
                "SET"
                " "
                oC_SetItem
                  oC_PropertyExpression
                    oC_Atom
                      oC_Variable
                        oC_SymbolicName
                          "c"
                    oC_PropertyLookup
                      "."
                      oC_PropertyKeyName
                        oC_SchemaName
                          oC_SymbolicName
                            "property"
                  " "
                  "="
                  " "
                  oC_Expression
                    oC_OrExpression
                      oC_XorExpression
                        oC_AndExpression
                          oC_NotExpression
                            oC_ComparisonExpression
                              oC_AddOrSubtractExpression
                                oC_MultiplyDivideModuloExpression
                                  oC_PowerOfExpression
                                    oC_UnaryAddOrSubtractExpression
                                      oC_StringListNullOperatorExpression
                                        oC_PropertyOrLabelsExpression
                                          oC_Atom
                                            oC_Variable
                                              oC_SymbolicName
                                                "d"
                                          oC_PropertyLookup
                                            "."
                                            oC_PropertyKeyName
                                              oC_SchemaName
                                                oC_SymbolicName
                                                  "property"
            " "
            oC_With
              "WITH"
              oC_ProjectionBody
                " "
                oC_ProjectionItems
                  oC_ProjectionItem
                    oC_Expression
                      oC_OrExpression
                        oC_XorExpression
                          oC_AndExpression
                            oC_NotExpression
                              oC_ComparisonExpression
                                oC_AddOrSubtractExpression
                                  oC_MultiplyDivideModuloExpression
                                    oC_PowerOfExpression
                                      oC_UnaryAddOrSubtractExpression
                                        oC_StringListNullOperatorExpression
                                          oC_PropertyOrLabelsExpression
                                            oC_Atom
                                              oC_Variable
                                                oC_SymbolicName
                                                  "c"
            " "
            oC_SinglePartQuery
              oC_ReadingClause
                oC_Match
                  "MATCH"
                  " "
                  oC_Pattern
                    oC_PatternPart
                      oC_AnonymousPatternPart
                        oC_PatternElement
                          oC_NodePattern
                            "("
                            oC_Variable
                              oC_SymbolicName
                                "e"
                            oC_NodeLabels
                              oC_NodeLabel
                                ":"
                                oC_LabelName
                                  oC_SchemaName
                                    oC_SymbolicName
                                      "Label3"
                            " "
                            oC_Properties
                              oC_MapLiteral
                                "{"
                                oC_PropertyKeyName
                                  oC_SchemaName
                                    oC_SymbolicName
                                      "name"
                                ":"
                                " "
                                oC_Expression
                                  oC_OrExpression
                                    oC_XorExpression
                                      oC_AndExpression
                                        oC_NotExpression
                                          oC_ComparisonExpression
                                            oC_AddOrSubtractExpression
                                              oC_MultiplyDivideModuloExpression
                                                oC_PowerOfExpression
                                                  oC_UnaryAddOrSubtractExpression
                                                    oC_StringListNullOperatorExpression
                                                      oC_PropertyOrLabelsExpression
                                                        oC_Atom
                                                          oC_Variable
                                                            oC_SymbolicName
                                                              "c"
                                                        oC_PropertyLookup
                                                          "."
                                                          oC_PropertyKeyName
                                                            oC_SchemaName
                                                              oC_SymbolicName
                                                                "name"
                                "}"
                            ")"
              " "
              oC_Return
                "RETURN"
                oC_ProjectionBody
                  " "
                  oC_ProjectionItems
                    oC_ProjectionItem
                      oC_Expression
                        oC_OrExpression
                          oC_XorExpression
                            oC_AndExpression
                              oC_NotExpression
                                oC_ComparisonExpression
                                  oC_AddOrSubtractExpression
                                    oC_MultiplyDivideModuloExpression
                                      oC_PowerOfExpression
                                        oC_UnaryAddOrSubtractExpression
                                          oC_StringListNullOperatorExpression
                                            oC_PropertyOrLabelsExpression
                                              oC_Atom
                                                oC_Variable
                                                  oC_SymbolicName
                                                    "c"
  "<EOF>"
I have written some demo code showing how to use the Antlr parser (it produced the output above). See PR #518.
There are two ways to consume the parse tree: either with a tree listener or by manually walking the tree structure. I guess the first is preferable.
There are two ways to get the parser: either you build it directly from the g4 grammar published on the oC webpage, or you create the parser from the grammar XML files in this repository (which generate the g4 grammar as an intermediate step). I guess the first is preferable.
The demo shows all four possible combinations of these options.
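For readers who want to see the difference between the two ways of consuming the tree, here is a rough, self-contained sketch. It is not the demo code from the PR: Node, Listener and walk are hypothetical stand-ins that only imitate the roles ANTLR's ParseTree, ParseTreeListener and ParseTreeWalker play, and the sample tree is a heavily shortened version of the output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical miniature tree, NOT the ANTLR runtime classes.
final class TreeConsumption {
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    // Callback interface, loosely modelled on a tree listener.
    interface Listener {
        void enter(Node node);
        void exit(Node node);
    }

    // Way 1: a walker drives the depth-first traversal and calls the listener.
    static void walk(Listener listener, Node node) {
        listener.enter(node);
        for (Node child : node.children) {
            walk(listener, child);
        }
        listener.exit(node);
    }

    // Way 2: the caller walks the structure itself and decides how to descend.
    static void collectManually(Node node, List<String> out) {
        out.add(node.name);
        for (Node child : node.children) {
            collectManually(child, out);
        }
    }

    // Shortened sample tree: one MATCH, one WITH, then the final single-part query.
    static Node sample() {
        return new Node("oC_MultiPartQuery",
                new Node("oC_ReadingClause", new Node("oC_Match")),
                new Node("oC_With"),
                new Node("oC_SinglePartQuery", new Node("oC_Return")));
    }

    // Records rule names in the order the walker enters them.
    static List<String> collectWithListener(Node root) {
        final List<String> seen = new ArrayList<>();
        walk(new Listener() {
            @Override public void enter(Node node) { seen.add(node.name); }
            @Override public void exit(Node node) { }
        }, root);
        return seen;
    }

    public static void main(String[] args) {
        Node root = sample();
        List<String> manual = new ArrayList<>();
        collectManually(root, manual);
        System.out.println(collectWithListener(root));
        System.out.println(manual);
    }
}
```

Both variants see the nodes in input order, which is the property that matters for telling what comes before or after a WITH.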
Hi hvub,
I am very sorry, I obviously messed up the example. Therefore, I created the repository https://github.com/dahoat/OpenCypherExperiments with a running example (Maven/Spring). I should also mention that I am using Visitors instead of Listeners.
The query
MATCH (a) WITH a MATCH (b) SET a.prop = 'match 2' WITH a, b MATCH (c) WITH a, b, c MATCH (d) RETURN *
is a much better example of my problem.
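To make the problem concrete, here is a small self-contained sketch of what the per-type lists look like for this query. It does not use the generated parser: Clause, Kind and ofKind are hypothetical stand-ins that only imitate accessors returning every child of one rule type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the children of one oC_MultiPartQuery node.
// These are NOT the classes ANTLR generates.
final class FlattenedAccessors {
    enum Kind { READING, UPDATING, WITH }

    static final class Clause {
        final Kind kind;
        final String text;

        Clause(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Clauses in source order, up to (not including) the final MATCH (d) RETURN *.
    static List<Clause> children() {
        List<Clause> clauses = new ArrayList<>();
        clauses.add(new Clause(Kind.READING, "MATCH (a)"));
        clauses.add(new Clause(Kind.WITH, "WITH a"));
        clauses.add(new Clause(Kind.READING, "MATCH (b)"));
        clauses.add(new Clause(Kind.UPDATING, "SET a.prop = 'match 2'"));
        clauses.add(new Clause(Kind.WITH, "WITH a, b"));
        clauses.add(new Clause(Kind.READING, "MATCH (c)"));
        clauses.add(new Clause(Kind.WITH, "WITH a, b, c"));
        return clauses;
    }

    // Imitates a typed accessor: keeps the order within one kind,
    // but drops the position relative to clauses of other kinds.
    static List<String> ofKind(List<Clause> clauses, Kind kind) {
        List<String> result = new ArrayList<>();
        for (Clause clause : clauses) {
            if (clause.kind == kind) {
                result.add(clause.text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Clause> clauses = children();
        System.out.println("reading:  " + ofKind(clauses, Kind.READING));
        System.out.println("updating: " + ofKind(clauses, Kind.UPDATING));
        System.out.println("with:     " + ofKind(clauses, Kind.WITH));
    }
}
```

Looking only at the three resulting lists, nothing indicates that the SET sits between the first and the second WITH, or which MATCH it follows.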
All the best, Daniel
Hi Daniel,
I have modified your visitor here https://github.com/dahoat/OpenCypherExperiments/pull/1 to demonstrate how you can reproduce the groups without needing any grammar changes.
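The general idea can be sketched roughly as follows. This is not the code from the pull request, only a minimal self-contained illustration: groupByWith is a hypothetical helper that walks the children in source order and closes a group at every WITH. With the generated parser, the same loop could presumably run over the children of the oC_MultiPartQuery context with a test such as child instanceof CypherParser.OC_WithContext (class name assumed from ANTLR's naming convention), after skipping whitespace tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class ClauseGrouping {
    // Splits children (in source order) into groups. Each group ends with the
    // child for which isWith is true; anything after the last WITH (the final
    // single-part query) is returned as a trailing group.
    static <T> List<List<T>> groupByWith(List<T> childrenInOrder, Predicate<T> isWith) {
        List<List<T>> groups = new ArrayList<>();
        List<T> current = new ArrayList<>();
        for (T child : childrenInOrder) {
            current.add(child);
            if (isWith.test(child)) {
                groups.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Plain strings stand in for parse tree nodes here.
        List<String> clauses = Arrays.asList(
                "MATCH (a)", "WITH a",
                "MATCH (b)", "SET a.prop = 'match 2'", "WITH a, b",
                "MATCH (c)", "WITH a, b, c",
                "MATCH (d) RETURN *");
        for (List<String> group : groupByWith(clauses, c -> c.startsWith("WITH"))) {
            System.out.println(group);
        }
    }
}
```

Because the helper is generic over the element type, it does not care whether the elements are strings, as in this sketch, or parse tree nodes.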
Hope it helps. -H