How can I convert the output of Lark().parse() into JSON?
What is your question?
I have a working grammar, and I can parse my input into an abstract syntax tree. I can print the tree to the console as an object or in its "pretty" form, or render it to a PNG image. Is it possible to convert it into a JSON representation so that I can use the AST in another program?
If you're having trouble with your code or grammar
import json
from lark import Lark, tree

with open('grammar.lark', 'r') as grammar:
    parser = Lark(grammar, start='start')

with open('vegancupcakes.json', 'r') as recipe:
    for i in json.load(recipe):
        print(parser.parse(i).pretty())

# `input` stands for the text to parse
tree.pydot__tree_to_png(parser.parse(input), 'ast.png')
I looked through the documentation and the source code but couldn't find out whether the Tree class has any other export methods. Sorry if it's there and I missed it. What I'm looking for is a method like tree.pydot__tree_to_png(), but producing JSON instead of a PNG. Failing that, is there another library I can use, or an alternative way of achieving the same thing?
Thank you
Is there a problem with creating your own method? It is not very complex.
@leliamesteban You should implement a Transformer that turns the tree into JSON. There are no built-in methods for that in Lark.
I have created a standalone example containing functions tree_to_json_str and tree_to_json:
https://gist.github.com/charles-esterbrook/9ab557d70391fd85ebac2b1a59a326cf
You can adapt it to your needs. Another approach would be to convert the tree to plain Python data and then use json.dump/dumps.
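The "convert to Python data first" approach can be sketched roughly as follows. This is not the code from the gist; the names `tree_to_data` and `tree_to_json_text` are made up here. The sketch relies only on `Token` being a `str` subclass with a `.type` attribute and on `Tree` exposing `.data` and `.children`, so it does not need to import Lark itself.

```python
import json


def tree_to_data(node):
    """Recursively convert a Lark parse tree into dicts, lists and strings."""
    if node is None:
        # Assumption: absent optional items may show up as None placeholders.
        return None
    if isinstance(node, str):
        # lark.Token subclasses str and carries the terminal name in `.type`.
        return {"type": getattr(node, "type", None), "value": str(node)}
    # Anything else is assumed to be a lark.Tree with `.data` and `.children`.
    return {
        "type": str(node.data),
        "children": [tree_to_data(child) for child in node.children],
    }


def tree_to_json_text(node, **dumps_kwargs):
    """Serialize the converted tree; extra keyword arguments go to json.dumps."""
    return json.dumps(tree_to_data(node), **dumps_kwargs)
```

With the `parser` from the question, this would be called as `tree_to_json_text(parser.parse(text), indent=2)`, where `text` is whatever string is being parsed. The resulting string can then be loaded by any program that has a JSON parser.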
This ticket can be closed.
Not the question asked, but after having used the script of @charles-esterbrook I realised that the YAML format may be a better fit for an AST. The code is cleaner, too:
from lark import Token

def ast_to_yaml(node, indent=""):
    # Each nesting level adds two spaces of indentation.
    indent += "  "
    if isinstance(node, Token):
        yield f"{indent}- type: {node.type}"
        yield f"{indent}  value: {repr(node.value)}"
    else:
        yield f"{indent}- type: {node.data}"
        yield f"{indent}  children:"
        for child in node.children:
            yield from ast_to_yaml(child, indent)
Example of output:
  - type: start
    children:
    - type: line
      children:
      - type: entity_clause
        children:
        - type: entity_name_def
          children:
          - type: box_name
            children:
            - type: BOX_NAME
              value: 'FOO'
          - type: COLON
            value: ': '
        - type: seq
          children:
          - type: entity_or_table_attr
            children:
            - type: typed_attr
              children:
              - type: attr
                children:
                - type: ATTR
                  value: 'bar'
      - type: NL
        value: '\n'
    - type: line
      children:
      - type: NL
        value: '\n'