erlfmt Add support for formatting .yrl and .xrl files

cc @ilya-klyuchnikov

Jul 03 '20 16:07 michalmuskala

Jumping on this task, I have some questions:

erlfmt can process arbitrary files. So am I correct by summarising the task as:

Manually identify the rules that shouldn't be applied to yrl/xrl.
Filter them in the code when file have extension yrk/xrl (maybe the ignore-list is different for each, not big deal).
Mute the warnings (syntax error) for non erlang-like sections (or should we plug a dedicated grammar?)
Process yrl/xrl files when formatting a whole directory.

May 12 '21 14:05 slaykachu

Each erlang form ends with a . In a .yrl file, these forms can be mixed and either be a normal erlang form or a yrl specific form. Currently we warn on any forms (including .yrl specific forms) that we cannot parse. We should keep on warning on forms we cannot parse, but we want to add the ability to parse .yrl specific forms.

Here are some things we will need to add:

I think we will need to add the ability to parse .yrl specific forms into our erlfmt_parse.yrl grammar OR we can create a separate grammar that can only parse the specific .yrl forms and use it as a fallback when parsing normal erlang forms fails.
erlfmt_recomment.erl will also need to be able to add comments back into the AST
Finally we need to add the ability to erlfmt_format.erl to be able to format these new expressions.

May 12 '21 14:05 awalterschulze

@michalmuskala I think I saw a specification for .yrl grammars previously, but I can't find it again for some reason. Do you know if I was imagining things or if this exists somewhere?

May 12 '21 14:05 awalterschulze

It's in here a bit lower: https://erlang.org/doc/man/yecc.html#more-examples

I agree adding first-class support for yecc forms would be probably the "nicest" solution

May 12 '21 14:05 michalmuskala

It can be done gradually. - There is the explicit "Erlang code." part of .yrl files - adding support for understanding and formatting it may be the first step.

May 12 '21 14:05 ilya-klyuchnikov

It's in here a bit lower: https://erlang.org/doc/man/yecc.html#more-examples

I agree adding first-class support for yecc forms would be probably the "nicest" solution

Amazing thank you, I somehow missed that.

Screen Shot 2021-05-12 at 3 55 39 PM

May 12 '21 14:05 awalterschulze

It can be done gradually. - There is the explicit "Erlang code." part of .yrl files - adding support for understanding and formatting it may be the first step.

Looks like this actually mostly works today - just issues a lot of warnings. https://github.com/WhatsApp/erlfmt/pull/289

May 12 '21 15:05 michalmuskala

Yes, this task is about also formatting the yrl specific parts without needing to warn, because with work we should be able to also parse and format it.

May 12 '21 15:05 awalterschulze

But technically, like Ilya maybe meant, this can also be done gradually, by for example, first only supporting declaration and only in a second pull request support rule, etc.

May 12 '21 15:05 awalterschulze

Yes, this task is about also formatting the yrl specific parts without needing to warn, because with work we should be able to also parse and format it.

Until everything is supported, I think a good compromise could be to warn once that the file wasn't entirely formatted.

May 13 '21 10:05 slaykachu

I agree adding first-class support for yecc forms would be probably the "nicest" solution

Then, is there a reference on how it should be formatted? Anyway, looking at erlfmt_parsle.yrl as an exhaustive example, it seems that would could only make things worse:

Nice tabulation would be broken: Right now:

attribute -> '-' 'if' attr_val               : build_attribute('$1', '$2', '$3').
attribute -> '-' atom                        : build_attribute('$1', '$2', no_parens).
attribute -> '-' atom attr_val               : build_attribute('$1', '$2', '$3').

Potential worse result:

attribute -> '-' 'if' attr_val : build_attribute('$1', '$2', '$3').
attribute -> '-' atom : build_attribute('$1', '$2', no_parens).
attribute -> '-' atom attr_val : build_attribute('$1', '$2', '$3').

Compact rules would take much more vertical space Right now:

type -> type '::' type : ?mkop2('$1', '$2', '$3').
type -> type '|' type : ?mkop2('$1', '$2', '$3').
type -> type '..' type : ?mkop2('$1', '$2', '$3').

Potential worse result:

type -> type '::' type : 
    ?mkop2('$1', '$2', '$3').
type -> type '|' type : 
    ?mkop2('$1', '$2', '$3').
type -> type '..' type : 
    ?mkop2('$1', '$2', '$3').

Or

type -> 
    type '::' type : ?mkop2('$1', '$2', '$3').
type -> 
    type '|' type : ?mkop2('$1', '$2', '$3').
type -> 
    type '..' type : ?mkop2('$1', '$2', '$3').

May 18 '21 11:05 slaykachu

Great question and thank you for bringing examples in the ask, this points out great things to consider and is making me think.

Of the ideas here, I would prefer

type -> type '::' type : 
    ?mkop2('$1', '$2', '$3').
type -> type '|' type : 
    ?mkop2('$1', '$2', '$3').
type -> type '..' type : 
    ?mkop2('$1', '$2', '$3').

The reason is that erlfmt does not align things anywhere else.

I have looked at how I have manually formatted a bnf for another grammar I worked on and I naturally conformed to this style sometimes. Although we had opening and closing braces for the SDT rule. So it was a bit of an easier style choice.

Trying out some other styles:

attribute ->
    '-' 'if' attr_val               
    : build_attribute('$1', '$2', '$3').

attribute
    -> '-' 'if' attr_val               
    : build_attribute('$1', '$2', '$3').

attribute -> 
    '-' 'if' attr_val :
    build_attribute('$1', '$2', '$3')
.

After that I think I still prefer your idea

attribute -> '-' 'if' attr_val :
    build_attribute('$1', '$2', '$3').

But we can discuss this for a while. The first step is parsing and creating a first format. Then it can be easier to iterate with tests.

But great to start thinking about this already. Thank you for bringing this up.

May 18 '21 12:05 awalterschulze

I'm a big fan of vertical alignment. Helps to see the structure / patterns (either it's repeated, or on the contrary the variations are more easily seen). It's also helpful to spot copy-paste mistake.

Some people cannot stand it, so it would be both complicated and controversial to make a pretty printer generate that. Maybe what we could do at some point is to preserve existing alignements, without the need to flag %% format off.

May 19 '21 13:05 slaykachu

Jumping out of this task! (Exercise is good for you). Rationale:

As @michalmuskala has demonstrated, that is already working for erlang parts of the files.
Isolating non-erlang parts and plugging corresponding ad-hoc grammars for those would be too much work for too little benefit.
In my book that would be actually negative benefit (a.k.a nuisance). See discussion about lost of vertical alignment.
One of the advantage of formatting is uniformity/consistency. Yet yrl and xrl files are very rare in most codebases.
I suggested to quiet or summarise the warnings for non-erlang parts. But we're rather keep them as is: more explicit, consistent and systematic. A bit verbose for users, yet handier for toolchain integration.
- Available mitigations: exclude such files or use %% format off.

Jun 14 '21 11:06 slaykachu

erlfmt erlfmt copied to clipboard

Add support for formatting .yrl and .xrl files

erlfmt
erlfmt copied to clipboard