beancount-parser
Proposal: a higher level, more structured API
Currently, the output of this parser is the AST from Lark, which I feel is not ideal for a few reasons.
- The AST usually requires some common post-processing. For example, the following snippet is valid in beancount, and the metadata belongs to the last posting rather than to the transaction.
    2000-01-01 *
      Assets:Foo 0.00 USD
      Assets:Bar 0.00 USD
        meta: 123
Under LALR(1), it's difficult to attach the metadata to the right node at the parsing stage without ambiguity, because both transactions and postings may be followed by metadata. Most likely the parser will produce a flat list of postings and metadata, and post-processing will then work out what belongs to what. If most valid use cases need this common post-processing, I think it should be included as part of this project.
- The contract is unclear and might change. It's difficult to guarantee that the AST structure and the rule / token names never change over time, and just as difficult to draw a clear line around what will not change. This makes users (e.g. me) concerned about breaking changes in the future.
- Comments don't really belong in the AST. I like that this parser keeps comments, but this is currently implemented by putting every line into a top-level node, which makes the AST not very structured. If we are to fix #3 and #8, where multi-line nodes will be required, keeping comments as dedicated AST nodes will become fairly messy. For example, which nodes should these comments belong to?
    2000-01-01 * #tag ; comment
      ; comment
      meta: 123 ; comment
      ^link ; comment
      ; comment
      Assets:Foo 0.00 USD ; comment
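To make the first point concrete, here is a minimal sketch of the kind of post-processing I mean. It assumes the parser hands back the body of a transaction as a flat list of `(kind, payload)` pairs; that shape, and the names `Posting`, `Transaction` and `group_body`, are made up for illustration and are not what this project currently produces.

```python
from dataclasses import dataclass, field


@dataclass
class Posting:
    account: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Transaction:
    date: str
    metadata: dict = field(default_factory=dict)
    postings: list = field(default_factory=list)


def group_body(date, items):
    """Attach each metadata item to the most recent posting, or to the
    transaction itself if no posting has been seen yet.

    `items` is a hypothetical flat list of ("posting", account) and
    ("metadata", (key, value)) pairs in source order.
    """
    txn = Transaction(date=date)
    for kind, payload in items:
        if kind == "posting":
            txn.postings.append(Posting(account=payload))
        elif kind == "metadata":
            key, value = payload
            owner = txn.postings[-1] if txn.postings else txn
            owner.metadata[key] = value
        else:
            raise ValueError(f"unexpected item kind: {kind!r}")
    return txn


# The example from the first point: `meta` follows the second posting.
txn = group_body(
    "2000-01-01",
    [
        ("posting", "Assets:Foo"),
        ("posting", "Assets:Bar"),
        ("metadata", ("meta", "123")),
    ],
)
```

With this input, `meta` ends up on the `Assets:Bar` posting and the transaction's own metadata stays empty. Every consumer of the raw AST would otherwise have to reimplement something like this.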
To solve these problems, I propose we add a higher level, more structured API, providing abstractions similar to beancount Directive objects for easier, more structured and more stable access, while providing an additional interface to access comments and raw AST nodes for advanced usage.
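As a rough illustration of the shape I have in mind (not a concrete design), a directive object could expose stable, typed fields and keep comments and the raw parse-tree node as optional escape hatches. All names here (`Comment`, `OpenDirective`, `comments`, `raw`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass(frozen=True)
class Comment:
    line: int  # 1-based line number in the source file
    text: str  # comment text, including the leading ";"


@dataclass
class OpenDirective:
    # Stable, documented fields that users can rely on.
    date: str
    account: str
    metadata: dict = field(default_factory=dict)
    # Escape hatches for advanced usage such as formatters.
    comments: List[Comment] = field(default_factory=list)
    raw: Any = None  # underlying parse-tree node; its layout is not part of the contract


entry = OpenDirective(
    date="2000-01-01",
    account="Assets:Foo",
    comments=[Comment(line=1, text="; main checking account")],
)
```

The point of the split is that `date`, `account` and `metadata` can stay stable across grammar changes, while anything reached through `raw` is explicitly at the user's own risk.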
I see similar work has already happened in beancount-black. Maybe we could start by moving it here.
Thoughts?