
[Feature Request] Skip individual expression

Open akriegman opened this issue 2 years ago • 6 comments

It would be nice if there were a way to skip a single expression. It gets a bit cumbersome to write #! format: off and #! format: on before and after everything I want skipped. It would be cool if we could write #=! format: skip =# and have the next expression skipped. For example:

# Here the entire begin block gets skipped
a = #=! format: skip =# begin
  silly   =
    formatting
      end

# Here the second vector gets formatted normally
b = #=! format: skip =# [1,2,3,4] + [5, 6, 7, 8]

Not sure what the ideal syntax would be. Maybe #= format: skip =# would be better, but the ! would be more consistent with the existing syntax. It might be nice if we have the option to make it more compact too, like #=format:skip=#.

I'm taking inspiration here from rustfmt. It lets you skip a subexpression like this with #[rustfmt::skip], but only if you turn on a special feature in the nightly compiler, lol.

akriegman, Feb 05 '22 18:02

Maybe a middle ground would be #! format: skip for a single line?

goerz, Feb 06 '22 06:02

Is it common to put the skip comment after a variable name? That seems super weird. Is that how Rust does it?

domluna, Feb 18 '22 23:02

> Is it common to put the skip comment after a variable name? That seems super weird. Is that how Rust does it?

The idea is it can go anywhere in the expression. It essentially prunes an entire branch of the syntax tree. A simple algorithm could be:

1. Find the next token after the skip comment.
2. Find the smallest branch of the AST which includes that token.
3. Don't format any code in that branch.
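A minimal sketch of those three steps, assuming a toy tree in which every node records the byte span it covers in the source. The Node type and the helpers next_token_start and smallest_enclosing are hypothetical names made up for illustration; they are not part of JuliaFormatter or CSTParser, and punctuation is deliberately not modeled as separate nodes.

```julia
# Toy syntax node: a byte span into the source plus child nodes.
# Hypothetical type for illustration only.
struct Node
    span::UnitRange{Int}
    children::Vector{Node}
end
Node(span::UnitRange{Int}) = Node(span, Node[])

const SKIP = "#=! format: skip =#"

# Step 1: index of the first non-whitespace character after the skip comment.
function next_token_start(src::AbstractString)
    r = findfirst(SKIP, src)
    r === nothing && return nothing
    i = nextind(src, last(r))
    while i <= ncodeunits(src) && isspace(src[i])
        i = nextind(src, i)
    end
    return i <= ncodeunits(src) ? i : nothing
end

# Step 2: descend to the smallest node whose span contains `pos`.
function smallest_enclosing(node::Node, pos::Int)
    for child in node.children
        pos in child.span && return smallest_enclosing(child, pos)
    end
    return node
end

# Hand-built tree for the second example from the first post.
src = "b = #=! format: skip =# [1,2,3,4] + [5, 6, 7, 8]"
v1 = Node(findfirst("[1,2,3,4]", src))
v2 = Node(findfirst("[5, 6, 7, 8]", src))
plus = Node(first(v1.span):last(v2.span), [v1, v2])
root = Node(1:ncodeunits(src), [Node(1:1), plus])

# Step 3: a formatter would copy this text through unchanged.
skipped = smallest_enclosing(root, next_token_start(src))
println(src[skipped.span])  # prints [1,2,3,4]
```

Only the first vector falls inside the pruned branch, so the second vector would still be formatted normally.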

Here's a weird example from my Rust code:

for lib in #[rustfmt::skip] [
  if pos.x > 0                      { Some(Point { x: pos.x - 1, ..pos }) } else { None },
  if pos.y > 0                      { Some(Point { y: pos.y - 1, ..pos }) } else { None },
  if pos.x < self.size as isize - 1 { Some(Point { x: pos.x + 1, ..pos }) } else { None },
  if pos.y < self.size as isize - 1 { Some(Point { y: pos.y + 1, ..pos }) } else { None },
]
.into_iter()
.flatten()
{
  // ...
}

Here we are iterating over an inlined array. The skip attribute comes right before that array, and so the entire array is left unformatted. The rest of the loop is formatted as normal.

> Maybe a middle ground would be #! format: skip for a single line?

That would be useful, but if we think about the Julia equivalent of my above example, it would mean I'd have to skip the entire for loop when I only want to skip that one subexpression. That is, if you meant "skip the expression / block / statement starting with the next line". If you meant "skip just one line", then that could be interesting... in the above example I'd have to skip those two lines individually, which would be weird, or I could use the on and off approach to skip that whole array. The #= format: skip =# approach would allow more targeted pruning of the syntax tree. I guess skipping a single line might be easier to implement using existing machinery, but this expression skipping approach would be a more complete solution.

akriegman, Feb 27 '22 12:02

Ok, I see! That's actually pretty cool haha.

The difficult part about doing something like this right now is that the original source fragments for "no format regions" are gathered before the syntax tree is formed, directly on the token stream in a pre-processing step.
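To make that concrete, here is a rough, line-based stand-in for such a pre-processing pass. It is not JuliaFormatter's actual code (which works on tokens); the function name noformat_regions is hypothetical, and treating an unterminated off marker as running to the end of the input is an assumption of this sketch. The point is that the regions are found without any syntax tree, so nothing here knows where an expression begins or ends.

```julia
# Returns the line ranges enclosed by `#! format: off` ... `#! format: on`.
# Assumption of this sketch: an unterminated `off` runs to the last line.
function noformat_regions(src::AbstractString)
    regions = UnitRange{Int}[]
    start = nothing
    lines = split(src, '\n')
    for (i, line) in enumerate(lines)
        s = strip(line)
        if s == "#! format: off" && start === nothing
            start = i
        elseif s == "#! format: on" && start !== nothing
            push!(regions, start:i)
            start = nothing
        end
    end
    start !== nothing && push!(regions, start:length(lines))
    return regions
end

src = """
x = 1
#! format: off
y   =   2
#! format: on
z = 3
"""
println(noformat_regions(src))  # prints UnitRange{Int64}[2:4]
```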

What we would need is something like a mapping of syntax node => original source text. CSTParser sort of does this, but it has pitfalls, which is why JuliaFormatter has to do a good amount of bookkeeping itself: https://github.com/domluna/JuliaFormatter.jl/blob/master/src/document.jl#L51-L207

I'm curious if this is something that could be easier with https://github.com/c42f/JuliaSyntax.jl

cc @c42f

domluna, Mar 01 '22 17:03

JuliaSyntax.jl would probably offer a good way to do this — it initially parses the file into a ParseStream data structure which contains a vector of tokens and a vector of syntax tree node spans. After you have a ParseStream you can call build_tree to get a syntax tree from it.

However, with a ParseStream you can also rearrange how trivia is associated to the syntax tree nodes by traversing the list of tokens and applying heuristics to attach trivia tokens in various different ways. For a look at how this can be done, see https://github.com/c42f/JuliaSyntax.jl/pull/22 which rearranges nodes to match the CSTParser rules for trivia attachment.

Notably, this is all done before the syntax tree is constructed and on a linear rather than hierarchical representation of the source code. So it's a convenient representation for algorithms such as the one suggested by @akriegman.
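As a rough illustration of working on the linear form, the sketch below models a parse result as a flat vector of tokens plus a vector of node spans, each span being a range of token indices. The Token type, is_trivia and skipped_span are hypothetical names for this toy model; they are not JuliaSyntax's actual types or API.

```julia
# Toy model of a flat parse result. Hypothetical, not JuliaSyntax's types.
struct Token
    kind::Symbol      # :comment, :whitespace, or :code in this toy model
    text::String
end

is_trivia(t::Token) = t.kind in (:comment, :whitespace)

# Find the skip comment, then the next non-trivia token, then the
# shortest node span (a range of token indices) that contains it.
function skipped_span(tokens::Vector{Token}, spans::Vector{UnitRange{Int}})
    c = findfirst(t -> t.kind == :comment && occursin("format: skip", t.text), tokens)
    c === nothing && return nothing
    j = findnext(t -> !is_trivia(t), tokens, c + 1)
    j === nothing && return nothing
    best = nothing
    for s in spans
        if j in s && (best === nothing || length(s) < length(best))
            best = s
        end
    end
    return best
end

# Tokens for: b = #=! format: skip =# [1,2] + [3,4]
code(s) = Token(:code, s)
ws = Token(:whitespace, " ")
tokens = [code("b"), ws, code("="), ws, Token(:comment, "#=! format: skip =#"), ws,
          code("["), code("1"), code(","), code("2"), code("]"), ws, code("+"), ws,
          code("["), code("3"), code(","), code("4"), code("]")]

# Node spans: the assignment, the sum, and the two vectors.
spans = [1:19, 7:19, 7:11, 15:19]

span = skipped_span(tokens, spans)
println(join(t.text for t in tokens[span]))  # prints [1,2]
```

No tree is built at any point: the lookup is a scan over the token vector followed by a scan over the span vector.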

c42f, Mar 16 '22 07:03

Awesome! Sounds as if it can simplify current approaches and open the door to features like this one :+1:

domluna, Mar 16 '22 16:03