prql Add comments to `fmt`

What's up?

We discussed comments (and associated aesthetic items, such as significant newlines) on Discord. A couple of options for how to do this:

Add to the AST — change the Parser to parse comments, attaching them to AST nodes. For example, there would be a comments field on the Expr struct containing the comments before the expression.
- This keeps context local — each node can format itself without requiring the context of the full query
- But it would mean lots of changes to the parser. Some nodes aren't Expr and so would require a different design (e.g. a comment above an annotation wouldn't fit anywhere; similarly we'd need to add this to Stmt too.)
Require passing in the source into WriteSource, use the source to find and write the comments while writing the query.
- This is less dev work, because comments don't need to go through the PL.
- But it means formatting something requires global context, which is less modular and harder to maintain

IIUC these are the main two approaches. There are variations — for example having comments part of a CST but not AST; though in our case we don't have a CST.

Any thoughts?

Jan 20 '24 21:01 max-sixty

Discussed on the dev call — let's go for the second approach — "Require passing in the source into WriteSource"

Jan 23 '24 19:01 max-sixty

What is the status on this? Is someone actively working on getting comments working?

Jul 01 '24 04:07 m-span

Ah, I haven't been keeping this issue up-to-date, thanks for the ping @m-span .

Relevant links: https://github.com/PRQL/prql/pull/4397#issuecomment-2187146337, https://github.com/PRQL/prql/pull/4639

Briefly:

The existing formatter uses PL. But PL doesn't have comments, which are only stored in tokens
We could rewrite the formatter to use LR tokens, referencing the PL to understand where things should go. That would be lots of rewriting
I tried two different approaches to avoid that work (each referenced above)
- Format the PL into PRQL (as before), and attempt to write interspersed comments by referencing LR Tokens. This is hard, since the writer needs to carry a lot of state (in particular, because the PL is nested, there are multiple PL Expr that are contiguous to a comment LR token, so we don't want to write out a comment token whenever a comment follows an Expr)
- Add comments into the PL AST. This almost works, but fails on one narrow case, and I can see some other looming issues

I'm not confident on the best next steps. There's a mini-goal of getting DocComments in the PL, which are much easier than comments, and uses a subset of #4639. Then I think we need to decide whether to a) try and take one of those approaches and add enough hacks that it works or b) rewrite the formatter from the perspective of LR tokens.

Jul 01 '24 20:07 max-sixty

prql prql copied to clipboard

Add comments to `fmt`

What's up?

prql
prql copied to clipboard