erlfmt
erlfmt copied to clipboard
Add support for formatting .yrl and .xrl files
cc @ilya-klyuchnikov
Jumping on this task, I have some questions:
erlfmt
can process arbitrary files. So am I correct by summarising the task as:
- Manually identify the rules that shouldn't be applied to yrl/xrl.
- Filter them in the code when file have extension yrk/xrl (maybe the ignore-list is different for each, not big deal).
- Mute the warnings (syntax error) for non erlang-like sections (or should we plug a dedicated grammar?)
- Process yrl/xrl files when formatting a whole directory.
Each erlang form ends with a .
In a .yrl
file, these forms can be mixed and either be a normal erlang form or a yrl
specific form.
Currently we warn on any forms (including .yrl
specific forms) that we cannot parse.
We should keep on warning on forms we cannot parse, but we want to add the ability to parse .yrl
specific forms.
Here are some things we will need to add:
- I think we will need to add the ability to parse
.yrl
specific forms into ourerlfmt_parse.yrl
grammar OR we can create a separate grammar that can only parse the specific.yrl
forms and use it as a fallback when parsing normal erlang forms fails. -
erlfmt_recomment.erl
will also need to be able to add comments back into the AST - Finally we need to add the ability to
erlfmt_format.erl
to be able to format these new expressions.
@michalmuskala I think I saw a specification for .yrl
grammars previously, but I can't find it again for some reason. Do you know if I was imagining things or if this exists somewhere?
It's in here a bit lower: https://erlang.org/doc/man/yecc.html#more-examples
I agree adding first-class support for yecc forms would be probably the "nicest" solution
It can be done gradually. - There is the explicit "Erlang code." part of .yrl
files - adding support for understanding and formatting it may be the first step.
It's in here a bit lower: https://erlang.org/doc/man/yecc.html#more-examples
I agree adding first-class support for yecc forms would be probably the "nicest" solution
Amazing thank you, I somehow missed that.
It can be done gradually. - There is the explicit "Erlang code." part of
.yrl
files - adding support for understanding and formatting it may be the first step.
Looks like this actually mostly works today - just issues a lot of warnings. https://github.com/WhatsApp/erlfmt/pull/289
Yes, this task is about also formatting the yrl
specific parts without needing to warn, because with work we should be able to also parse and format it.
But technically, like Ilya maybe meant, this can also be done gradually, by for example, first only supporting declaration
and only in a second pull request support rule
, etc.
Yes, this task is about also formatting the yrl specific parts without needing to warn, because with work we should be able to also parse and format it.
Until everything is supported, I think a good compromise could be to warn once that the file wasn't entirely formatted.
I agree adding first-class support for yecc forms would be probably the "nicest" solution
Then, is there a reference on how it should be formatted?
Anyway, looking at erlfmt_parsle.yrl
as an exhaustive example, it seems that would could only make things worse:
- Nice tabulation would be broken: Right now:
attribute -> '-' 'if' attr_val : build_attribute('$1', '$2', '$3').
attribute -> '-' atom : build_attribute('$1', '$2', no_parens).
attribute -> '-' atom attr_val : build_attribute('$1', '$2', '$3').
Potential worse result:
attribute -> '-' 'if' attr_val : build_attribute('$1', '$2', '$3').
attribute -> '-' atom : build_attribute('$1', '$2', no_parens).
attribute -> '-' atom attr_val : build_attribute('$1', '$2', '$3').
- Compact rules would take much more vertical space Right now:
type -> type '::' type : ?mkop2('$1', '$2', '$3').
type -> type '|' type : ?mkop2('$1', '$2', '$3').
type -> type '..' type : ?mkop2('$1', '$2', '$3').
Potential worse result:
type -> type '::' type :
?mkop2('$1', '$2', '$3').
type -> type '|' type :
?mkop2('$1', '$2', '$3').
type -> type '..' type :
?mkop2('$1', '$2', '$3').
Or
type ->
type '::' type : ?mkop2('$1', '$2', '$3').
type ->
type '|' type : ?mkop2('$1', '$2', '$3').
type ->
type '..' type : ?mkop2('$1', '$2', '$3').
Great question and thank you for bringing examples in the ask, this points out great things to consider and is making me think.
Of the ideas here, I would prefer
type -> type '::' type :
?mkop2('$1', '$2', '$3').
type -> type '|' type :
?mkop2('$1', '$2', '$3').
type -> type '..' type :
?mkop2('$1', '$2', '$3').
The reason is that erlfmt does not align things anywhere else.
I have looked at how I have manually formatted a bnf for another grammar I worked on and I naturally conformed to this style sometimes. Although we had opening and closing braces for the SDT rule. So it was a bit of an easier style choice.
Trying out some other styles:
attribute ->
'-' 'if' attr_val
: build_attribute('$1', '$2', '$3').
attribute
-> '-' 'if' attr_val
: build_attribute('$1', '$2', '$3').
attribute ->
'-' 'if' attr_val :
build_attribute('$1', '$2', '$3')
.
After that I think I still prefer your idea
attribute -> '-' 'if' attr_val :
build_attribute('$1', '$2', '$3').
But we can discuss this for a while. The first step is parsing and creating a first format. Then it can be easier to iterate with tests.
But great to start thinking about this already. Thank you for bringing this up.
I'm a big fan of vertical alignment. Helps to see the structure / patterns (either it's repeated, or on the contrary the variations are more easily seen). It's also helpful to spot copy-paste mistake.
Some people cannot stand it, so it would be both complicated and controversial to make a pretty printer generate that.
Maybe what we could do at some point is to preserve existing alignements, without the need to flag %% format off
.
Jumping out of this task! (Exercise is good for you). Rationale:
- As @michalmuskala has demonstrated, that is already working for erlang parts of the files.
- Isolating non-erlang parts and plugging corresponding ad-hoc grammars for those would be too much work for too little benefit.
- In my book that would be actually negative benefit (a.k.a nuisance). See discussion about lost of vertical alignment.
- One of the advantage of formatting is uniformity/consistency. Yet yrl and xrl files are very rare in most codebases.
- I suggested to quiet or summarise the warnings for non-erlang parts. But we're rather keep them as is: more explicit, consistent and systematic. A bit verbose for users, yet handier for toolchain integration.
- Available mitigations: exclude such files or use
%% format off
.
- Available mitigations: exclude such files or use