beancount-parser Proposal: a higher level, more structured API

Open SEIAROTg opened this issue 2 years ago • 2 comments

Currently, the output of this parser is the AST from Lark, which I feel is not ideal for a few reasons.

The AST usually requires some common post-processing. For example, the following snippet is valid in beancount, and the metadata belongs to the posting rather than the transaction.
```
2000-01-01 *
    Assets:Foo        0.00 USD
        Assets:Bar    0.00 USD
    meta: 123
```
Under LALR(1) it's difficult to group the metadata into the right node at the parsing stage without ambiguity, as both transactions and postings may be followed by metadata. Most likely we'll get a flat list of postings and metadata and then have to work out what belongs to what during post-processing. If most valid use cases need this kind of post-processing, I think it should be included as part of this project.
The contract is unclear and might change. It's difficult to guarantee that the AST structure and the rule / token names never change over time, and equally difficult to draw a clear line around what will not change. This makes users (e.g. me) concerned about breaking changes in the future.
Comments don't really belong in the AST. I like that this parser keeps comments, but this is currently implemented by putting every line into a top-level node, which makes the AST not very structured. If we are to fix #3 and #8, where multi-line nodes will be required, keeping comments as dedicated AST nodes will become fairly messy. For example, which nodes should these comments belong to?
```
2000-01-01 *
    #tag ; comment
    ; comment
    meta: 123 ; comment
    ^link  ; comment
    ; comment
    Assets:Foo    0.00 USD ; comment
```
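
Returning to the metadata example in the first snippet, the grouping step could look roughly like the sketch below. It assumes the parser hands back a flat, source-ordered list of postings and metadata items; `Meta`, `Posting`, `Transaction` and `group_items` are hypothetical names for illustration, not the parser's actual AST nodes.

```python
from dataclasses import dataclass, field


# Hypothetical flat items, in source order. These are illustrative
# types only, not beancount-parser's actual AST nodes.
@dataclass
class Meta:
    key: str
    value: str


@dataclass
class Posting:
    account: str
    amount: str
    meta: dict = field(default_factory=dict)


@dataclass
class Transaction:
    date: str
    flag: str
    meta: dict = field(default_factory=dict)
    postings: list = field(default_factory=list)


def group_items(txn, items):
    """Attach each Meta to the most recent Posting, or to the
    transaction itself if no posting has been seen yet."""
    current = txn
    for item in items:
        if isinstance(item, Posting):
            txn.postings.append(item)
            current = item
        else:
            current.meta[item.key] = item.value
    return txn


txn = group_items(
    Transaction("2000-01-01", "*"),
    [
        Posting("Assets:Foo", "0.00 USD"),
        Posting("Assets:Bar", "0.00 USD"),
        Meta("meta", "123"),
    ],
)
```

Here `meta: 123` ends up on the `Assets:Bar` posting and the transaction's own metadata stays empty, which matches how beancount reads the snippet.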

To solve these problems, I propose we add a higher level, more structured API, providing abstractions similar to beancount Directive objects for easier, more structured and more stable access, while providing an additional interface to access comments and raw AST nodes for advanced usage.
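
To make the idea a bit more concrete, here is a rough sketch of what such an API might look like. Every name in it (`Comment`, `Posting`, `Transaction`, the `comments` and `raw` fields) is an assumption for illustration; none of it is the existing interface of this parser, beancount or beancount-black.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Comment:
    line: int  # source line number the comment appeared on
    text: str


@dataclass
class Posting:
    account: str
    amount: Optional[str] = None
    meta: dict = field(default_factory=dict)


@dataclass
class Transaction:
    date: str
    flag: str
    tags: set = field(default_factory=set)
    links: set = field(default_factory=set)
    meta: dict = field(default_factory=dict)
    postings: list = field(default_factory=list)
    # Comments are kept in a side list, keyed by line number, instead
    # of being nodes in the structured tree.
    comments: list = field(default_factory=list)
    # Underlying parser node, exposed for advanced usage only.
    raw: Any = None


txn = Transaction(
    date="2000-01-01",
    flag="*",
    tags={"tag"},
    links={"link"},
    meta={"meta": "123"},
    postings=[Posting("Assets:Foo", "0.00 USD")],
    comments=[Comment(line=2, text="; comment")],
)
```

With this shape, typical users only touch fields such as `txn.postings[0].account`, while tools that need to preserve formatting can still reach `txn.comments` and `txn.raw`.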

I see that similar work has already been done in beancount-black. Maybe we could start by moving it here.

Thoughts?

Jun 06 '22 01:06 SEIAROTg
