Token not appearing in child nodes

Open dodumosu opened this issue 7 years ago • 4 comments

Hello. I'm not quite sure what I'm doing wrong, but I tried to extend your calc example to work with more operators and to aid my understanding of how to work with Arpeggio. Everything except the exponent and null rules is working as expected.

I get an IndexError when I use 'NULL' in any expression. I also realized that any expression involving '^' didn't work as expected. For example, with the original (commented out) code, the expression '4 ^ 2' would return 4.0.

I'm sure this is all my own mistake, but if you could clarify this, I'd be most grateful. Thanks in advance.

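import operator

from arpeggio import PTNodeVisitor, visit_parse_tree
# Assumption: the original post omitted its imports. The '=' rule syntax
# below matches Arpeggio's cleanpeg dialect, so ParserPEG is taken from there.
from arpeggio.cleanpeg import ParserPEG

# Raw string so the regex backslashes are passed through unchanged.
grammar = r'''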
number = r'\d+\.{0,1}\d*'
variable = r'[A-Z]+'
null = "NULL"
value = null / number / variable / "(" expression ")"
exponent = value (("^") value)*
product = exponent (("*" / "/") exponent)*
sum = product (("+" / "-") product)*
comparison = sum ((">=" / ">" / "<=" / "<" / "==" / "!=") sum)*
expression = comparison (("&&" / "||") comparison)*
builder = expression+ EOF
'''

OPERATIONS = {
    '+': operator.iadd,
    '-': operator.isub,
    '*': operator.imul,
    '/': operator.itruediv,
    '^': operator.ipow,
    '>=': operator.ge,
    '>': operator.gt,
    '<=': operator.le,
    '<': operator.lt,
    '==': operator.eq,
    '!=': operator.ne,
    '&&': operator.and_,
    '||': operator.or_
}

class TreeVisitor(PTNodeVisitor):
    def visit_null(self, node, children):
        return None

    def visit_number(self, node, children):
        return float(node.value)

    def visit_value(self, node, children):
        return children[-1]

    def visit_exponent(self, node, children):
        # TODO: not sure why the exponent is a special case,
        # but the sign isn't being consumed/returned in the parse
        # tree
        # ---- ORIGINAL CODE ----
        # exponent = children[0]
        # for i in range(2, len(children), 2):
        #     sign = children[i - 1]
        #     exponent = OPERATIONS[sign](exponent, children[i])
        #
        # return exponent
        # ---- END ORIGINAL CODE ----
        if len(children) == 1:
            return children[0]

        exponent = children[0]
        for i in children[1:]:
            exponent **= i

        return exponent

    def visit_product(self, node, children):
        product = children[0]
        for i in range(2, len(children), 2):
            sign = children[i - 1]
            product = OPERATIONS[sign](product, children[i])

        return product

    def visit_sum(self, node, children):
        total = children[0]
        for i in range(2, len(children), 2):
            sign = children[i - 1]
            total = OPERATIONS[sign](total, children[i])

        return total

    def visit_comparison(self, node, children):
        comparison = children[0]
        for i in range(2, len(children), 2):
            sign = children[i - 1]
            comparison = OPERATIONS[sign](comparison, children[i])

        return comparison

    def visit_expression(self, node, children):
        expression = children[0]
        for i in range(2, len(children), 2):
            sign = children[i - 1]
            expression = OPERATIONS[sign](expression, children[i])

        return expression


parser = ParserPEG(grammar, 'builder')


def process_builder_expression(expression):
    tree = parser.parse(expression)
    return visit_parse_tree(tree, TreeVisitor())

dodumosu • Mar 24 '18 19:03

Hi @dodumosu. That is due to the way Arpeggio deals with plain string matches. When you have something in your grammar that will always match the same text (a string match outside a choice operator), Arpeggio will suppress that node, as it carries no additional information for the analysis. You know that the terminal must be there, or else your visitor wouldn't be called.

So in the case of null and ^ you have a single string match in your rule, and thus there will be no node for them in the children parameter. In the case of, e.g., '+' / '-' you use an ordered choice, so you don't know in advance which operation will be passed to the visitor. In that case there will be a terminal node for the operation.
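The effect on the visitor code from the original post can be sketched in plain Python, without Arpeggio itself. The two `children` lists below are hand-written stand-ins for what the explanation above says the visitors receive, and the helper names `fold_fixed_operator` and `fold_with_signs` are hypothetical. The sketch also shows why the original loop returned 4.0 for '4 ^ 2': with only two entries in `children`, `range(2, len(children), 2)` is empty, so the first operand is returned untouched.

```python
import operator

# Operators that sit inside an ordered choice, so they reach the visitor.
OPERATIONS = {"*": operator.mul, "/": operator.truediv}


def fold_fixed_operator(children, operation):
    """Fold operands when the operator token is suppressed (e.g. '^')."""
    result = children[0]
    for operand in children[1:]:
        result = operation(result, operand)
    return result


def fold_with_signs(children, operations):
    """Fold operand, operator, operand, ... when operators are in children."""
    result = children[0]
    for i in range(2, len(children), 2):
        result = operations[children[i - 1]](result, children[i])
    return result


# Hand-written stand-ins for the children lists described above.
exponent_children = [4.0, 2.0]       # '4 ^ 2': no '^' entry
product_children = [4.0, "*", 2.0]   # '4 * 2': '*' is present

print(fold_fixed_operator(exponent_children, operator.pow))  # 16.0
print(fold_with_signs(product_children, OPERATIONS))         # 8.0
print(fold_with_signs(exponent_children, OPERATIONS))        # 4.0, as reported
```

Like the loop in the original post, `fold_fixed_operator` evaluates left to right, so '4 ^ 2 ^ 3' would be treated as (4 ^ 2) ^ 3.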

That was an early design decision. I'm not sure anymore that it was the right one :)

I was sure that this was documented, but I can't find anything at the moment. If it's not documented, please leave this open until the docs are extended to explain this.

igordejanovic • Mar 25 '18 08:03

@igordejanovic Thanks for your reply. Having read the docs again, I've noticed that returning None from a visit_xxx() method removes that node from the parse tree. In the event that you actually need the None to be processed, what do you do? Thanks.

dodumosu • May 21 '18 11:05

Actually, returning None from a visitor removes that value from what the upper visitors receive. The tree is constructed before the visitors are applied, so all nodes are there the whole time during visiting. What you do with visitors is build some alternative representation of the parse tree (you are not changing the original parse tree). That could be some form of tree, a graph, or a single scalar value (see the calc example).

If you are asking how you can get a None value in the upper visitors anyway, one solution is to return some sort of sentinel value that, being non-None, will be passed to the upper visitors but will have None semantics in your domain. As a sentinel you can use anything that can't be a normal non-None value of the visitor.
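A minimal sketch of the sentinel idea in plain Python follows. It does not use Arpeggio: `drop_none` is only a simplified stand-in for the filtering described above, and the names `NULL`, `visit_null_plain`, `visit_null_sentinel` and `unwrap` are hypothetical. It also illustrates how a `visit_null` that returns None can lead to the IndexError from the original post, since `visit_value` then indexes an empty `children` list.

```python
class _Null:
    """Unique marker that stands in for None inside visitor results."""

    def __repr__(self):
        return "NULL"


NULL = _Null()


def drop_none(results):
    # Simplified stand-in: None results never reach the parent visitor.
    return [r for r in results if r is not None]


def visit_null_plain():
    # Mirrors the original visit_null: the value is filtered out upstream.
    return None


def visit_null_sentinel():
    # Returns a non-None marker, so the value survives the filtering.
    return NULL


def visit_value(children):
    # Same body as the original visit_value.
    return children[-1]


def unwrap(result):
    # Higher-level step: translate the marker back into a real None.
    return None if result is NULL else result


print(drop_none([visit_null_plain()]))                # [] -> children[-1] fails
value = visit_value(drop_none([visit_null_sentinel()]))
print(value)                                          # NULL
print(unwrap(value))                                  # None
```

Operators applied to the sentinel in intermediate visitors would still need to decide what NULL means for them; this sketch only covers passing the value through and converting it at the end.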

igordejanovic • May 21 '18 14:05

Thank you. That is what I eventually did; then, in a higher-level visitor, I transformed the value back to None.

dodumosu • May 21 '18 16:05