Arpeggio
Token not appearing in child nodes
Hello. I'm not quite sure what I'm doing wrong. I tried to extend your calc example to work with more operators, to help me understand how to work with Arpeggio, and everything except the exponent and null rules is working as expected.
I get an IndexError when I use 'NULL' in any expression. I also realized that any expression involving '^' didn't work as expected. For example, with the original (commented out) code, the expression '4 ^ 2' would return 4.0.
I'm sure this is all my own mistake, but if you could clarify this, I'd be most grateful. Thanks in advance.
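import operator

from arpeggio import PTNodeVisitor, visit_parse_tree
from arpeggio.peg import ParserPEG

grammar = '''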
number = r'\d+\.{0,1}\d*'
variable = r'[A-Z]+'
null = "NULL"
value = null / number / variable / "(" expression ")"
exponent = value (("^") value)*
product = exponent (("*" / "/") exponent)*
sum = product (("+" / "-") product)*
comparison = sum ((">=" / ">" / "<=" / "<" / "==" / "!=") sum)*
expression = comparison (("&&" / "||") comparison)*
builder = expression+ EOF
'''
OPERATIONS = {
    '+': operator.iadd,
    '-': operator.isub,
    '*': operator.imul,
    '/': operator.itruediv,
    '^': operator.ipow,
    '>=': operator.ge,
    '>': operator.gt,
    '<=': operator.le,
    '<': operator.lt,
    '==': operator.eq,
    '!=': operator.ne,
    '&&': operator.and_,
    '||': operator.or_
}
class TreeVisitor(PTNodeVisitor):
    def visit_null(self, node, children):
        return None

    def visit_number(self, node, children):
        return float(node.value)

    def visit_value(self, node, children):
        return children[-1]

    def visit_exponent(self, node, children):
        # TODO: not sure why the exponent is a special case,
        # but the sign isn't being consumed/returned in the parse
        # tree
        # ---- ORIGINAL CODE ----
        # exponent = children[0]
        # for i in range(2, len(children), 2):
        #     sign = children[i - 1]
        #     exponent = OPERATIONS[sign](exponent, children[i])
        #
        # return exponent
        # ---- END ORIGINAL CODE ----
        if len(children) == 1:
            return children[0]
        exponent = children[0]
        for i in children[1:]:
            exponent **= i
        return exponent

    def visit_product(self, node, children):
        product = children[0]
        for i in range(2, len(children), 2):
            sign = children[i - 1]
            product = OPERATIONS[sign](product, children[i])
        return product

    def visit_sum(self, node, children):
        total = children[0]
        for i in range(2, len(children), 2):
            sign = children[i - 1]
            total = OPERATIONS[sign](total, children[i])
        return total

    def visit_comparison(self, node, children):
        comparison = children[0]
        for i in range(2, len(children), 2):
            sign = children[i - 1]
            comparison = OPERATIONS[sign](comparison, children[i])
        return comparison

    def visit_expression(self, node, children):
        expression = children[0]
        for i in range(2, len(children), 2):
            sign = children[i - 1]
            expression = OPERATIONS[sign](expression, children[i])
        return expression
parser = ParserPEG(grammar, 'builder')

def process_builder_expression(expression):
    tree = parser.parse(expression)
    return visit_parse_tree(tree, TreeVisitor())
Hi @dodumosu. That is due to the way Arpeggio deals with plain string matches. When something in your grammar always matches the same text (a string match outside a choice operator), Arpeggio suppresses that node because it carries no additional information for the analysis. You know the terminal must be there, or else your visitor wouldn't have been called.
So in the case of null and ^ you have a single string match in your rule, and thus there will be no node for them in the children parameter. In the case of, e.g., '+' / '-' you use an ordered choice, so you don't know in advance which operation will be passed to the visitor. In that case there will be a terminal node for the operation.
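To make the consequence concrete, here is a small sketch in plain Python, with no Arpeggio import. The lists stand in for the `children` argument a visitor would receive according to the explanation above. `fold_with_signs` and `fold_operands` are hypothetical helper names, and only two operators are included.

```python
import operator

# Only the operators needed for this illustration.
OPERATIONS = {'*': operator.mul, '/': operator.truediv}


def fold_with_signs(children):
    """Fold [operand, sign, operand, ...], as produced by ("*" / "/")."""
    result = children[0]
    for i in range(2, len(children), 2):
        result = OPERATIONS[children[i - 1]](result, children[i])
    return result


def fold_operands(children):
    """Fold [operand, operand, ...], as produced when "^" is suppressed."""
    result = children[0]
    for operand in children[1:]:
        result = result ** operand  # left to right, like the loop in the question
    return result


product_children = [6.0, '*', 2.0, '/', 4.0]   # signs kept (ordered choice)
exponent_children = [4.0, 2.0]                 # "^" missing (single string match)

print(fold_with_signs(product_children))    # 3.0
print(fold_operands(exponent_children))     # 16.0
# The sign-based loop never runs on operand-only children, because
# range(2, 2, 2) is empty, so the first operand comes back unchanged.
print(fold_with_signs(exponent_children))   # 4.0
```

The last call reproduces the 4.0 reported for '4 ^ 2' with the original, sign-based loop.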
That was an early design decision. I'm not sure anymore that it was the right one :)
I was sure that this was documented, but I can't find anything at the moment. If it's not documented, please leave this open until the docs are extended to explain it.
@igordejanovic thanks for your reply. Having read the docs again, I've noticed that returning None from a visit_xxx() method removes that node from the parse tree. If you actually need the None to be processed, what do you do? Thanks.
Actually, returning None from a visitor will remove that value from what the upper visitors receive. The tree is constructed before the visitors are applied, so all nodes are there the whole time during visiting. What you do with visitors is build some alternative representation of the parse tree (you are not changing the original parse tree). That could be some form of tree, a graph, or a single scalar value (see the calc example).
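A simplified model of that behaviour, written without Arpeggio so it can be run on its own. `collect_children` is a hypothetical stand-in for the step that gathers child results before the parent visitor is called. It is an assumption made for illustration, not the library's actual code.

```python
def collect_children(results):
    """Keep only non-None child results (assumed model of the visiting step)."""
    return [r for r in results if r is not None]


def visit_null(node, children):
    return None              # as in the question


def visit_value(node, children):
    return children[-1]      # as in the question


# Visiting the input "NULL": the null visitor runs first, then value.
children_of_value = collect_children([visit_null(None, [])])
print(children_of_value)     # []

try:
    visit_value(None, children_of_value)
except IndexError as exc:
    # children is empty, so children[-1] has nothing to return.
    print("IndexError:", exc)
```

If the model holds, this would account for the IndexError reported for any expression containing 'NULL'.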
If you are asking how you can get a None value in the upper visitors anyway, one solution is to return some sort of sentinel value that, as a non-None value, will be passed to the upper visitors but will have None semantics in your domain. As a sentinel you can use anything that can't be a normal non-None value of the visitor.
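A minimal sketch of the sentinel idea, again in plain Python. The names `_NullType`, `NULL` and `visit_builder` are hypothetical and do not appear in the question's code. The visitor functions are written as free functions so that the snippet runs without Arpeggio.

```python
class _NullType:
    """Sentinel standing in for None inside the visitors."""

    def __repr__(self):
        return "NULL"


NULL = _NullType()   # single shared instance, compared with `is`


def visit_null(node, children):
    return NULL                      # non-None, so it reaches upper visitors


def visit_value(node, children):
    return children[-1]


def visit_builder(node, children):
    # Hypothetical top-level visitor: translate the sentinel back to None.
    result = children[0]
    return None if result is NULL else result


value = visit_value(None, [visit_null(None, [])])
print(value)                         # NULL
print(visit_builder(None, [value]))  # None
print(visit_builder(None, [16.0]))   # 16.0
```

How the sentinel should behave inside arithmetic or comparisons is a separate domain decision that this sketch leaves out.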
Thank you. That was what I did, eventually; then, in a higher-level visitor, I transformed the value to None.