Full YASGuide implementation

Open ararslan opened this issue 5 years ago • 40 comments

Here are some (hopefully complete) to-dos to make the experimental YASGuide implementation complete, for whoever wishes to work on it. This is the style guide my company uses, so we'd likely adopt JuliaFormatter if it supported it. This guide has a significant amount of overlap with BlueStyle, so adding the functionality outlined below would also increase the ability to support other popular guides. (Indeed, all but a couple of these apply to both YAS and Blue.)

  • [x] Require ; to separate positional and keyword arguments
  • [x] Rewrite x |> f to f(x)
  • [x] Align line-broken arguments to the parent opening parenthesis (format_text errors for me with function arguments on multiple lines)
  • [x] Require explicit return for returned values in long form function definitions and do blocks
  • [x] Rewrite short form function definitions to long form if the line gets too long
  • [x] Expand and paren-wrap arithmetic in indexing expressions (x[i+1:end] -> x[(i + 1):end])
  • [x] Rewrite import X to using X
  • [x] Rewrite @Module.macro to Module.@macro
  • [x] Annotate unannotated fields in type definitions with ::Any
  • [x] Remove spaces around = in named tuple literals
  • [ ] Rewrite function definitions inside of other function definitions to be lambdas
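
For readers unfamiliar with the guide, here is a rough sketch of what several of the checklist items look like in code. The names `scale`, `xs`, and `Point` are made up for illustration, and the comments describe the intended rewrites, not verified JuliaFormatter output.

```julia
# Hypothetical helper: `;` separates the positional argument from the keyword argument.
scale(x; factor=2) = x * factor

xs = [10, 20, 30, 40]
i = 1

piped = xs |> sum               # form to be rewritten: x |> f
called = sum(xs)                # target form: f(x)

tail_terse = xs[i+1:end]        # form to be rewritten
tail_wrapped = xs[(i + 1):end]  # target form: arithmetic expanded and paren-wrapped

nt = (a=1, b=2)                 # no spaces around `=` in a named tuple literal

struct Point                    # hypothetical type: fields annotated with ::Any
    x::Any
    y::Any
end

elapsed = Base.@elapsed sum(xs) # Module.@macro rather than @Module.macro
```

In each pair the two forms evaluate to the same value, so these rewrites are purely stylistic.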

ararslan avatar Mar 06 '20 21:03 ararslan

@ararslan which of these is preferable

  1. fit as much as possible on the same line
comp = [a * b + c for a = 1:10, # comment
        b = 11:20, c = 300:400]
  2. break all lines if there is a comment
comp = [a * b + c for
        a = 1:10, # comment
        b = 11:20,
        c = 300:400]

domluna avatar Mar 23 '20 03:03 domluna

That's a good question, I don't think I've ever seen a comment in the middle of a comprehension. The YASGuide itself definitely doesn't have anything specific to say on this. I'd probably be inclined to prefer the latter just because I would find it visually odd to separate loops that way, but also having the dangling for on the first line feels weird. @jrevels, thoughts?

I guess a third option is

comp = [a * b + c
        for a in 1:10,  # comment
            b in 11:20,
            c in 300:400]

but that indentation feels unsatisfyingly arbitrary.

ararslan avatar Mar 23 '20 04:03 ararslan

Yeah, I don't think the YASGuide mandates anything here, but in my own code, I think I tend to prefer @ararslan's third option.

For complicated multiline comprehension bodies, I'll sometimes use begin...end blocks, e.g.:

comp = [begin
            x = a * b + c
            y = x^2 + 3x # comment 1
        end
        for a in 1:10,  # comment 2
            b in 11:20,
            c in 300:400]

I haven't thought too hard about this preference though 😁

jrevels avatar Mar 23 '20 12:03 jrevels

It's no longer possible to use begin/end in comprehensions as of 1.4, since x[begin] is a thing (mirrors x[end]). You can use let/end I think though.

ararslan avatar Mar 23 '20 16:03 ararslan

It's allowed in untyped comprehensions like this one, but not in typed comprehensions.
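
A small sketch of that distinction, assuming Julia 1.4 or later. The variable names are illustrative only, and the `let` form follows the suggestion made earlier in the thread.

```julia
# Untyped comprehension: a begin...end body is accepted.
squares = [begin
               x = a + 1
               x^2
           end
           for a in 1:3]

# Typed comprehension: `begin` is not usable as a block opener inside T[...],
# so a let...end block stands in for it here.
typed_squares = Int[let x = a + 1
                        x^2
                    end
                    for a in 1:3]
```

Both comprehensions compute the same values; only the block keyword differs.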

StefanKarpinski avatar Mar 23 '20 16:03 StefanKarpinski

Ah, gotcha.

ararslan avatar Mar 23 '20 16:03 ararslan

I would find it visually odd to separate loops that way, but also having the dangling for on the first line feels weird

should this be formatted differently then?

https://github.com/beacon-biosignals/Onda.jl/blob/master/examples/tour.jl#L61-L62

domluna avatar Mar 25 '20 05:03 domluna

I'd say so, good catch :smile:

ararslan avatar Mar 25 '20 16:03 ararslan

@ararslan

Rewrite function definitions inside of other function definitions to be lambdas

can you give an example?

domluna avatar Mar 26 '20 04:03 domluna

Rewrite import X to using X

@StefanKarpinski are there cases where import would be preferred instead of using?

domluna avatar Mar 26 '20 04:03 domluna

If you want to add methods to the binding without qualification then import is required. So, suppose you have this:

module A
    export f
    function f end
end

Then this doesn't work:

julia> module B
           using ..A: f
           f() = 1
       end
ERROR: error in method definition: function A.f must be explicitly imported to be extended
Stacktrace:
 [1] top-level scope at none:0
 [2] top-level scope at REPL[2]:3

Whereas this does:

julia> module B
           import ..A: f
           f() = 1
       end
Main.B

I don't particularly like this distinction and argued against it pre-1.0, but there you have it. If, however, you use qualified names to extend imported functions then you can always use using:

julia> module B
           using ..A: A, f
           A.f() = 1
       end
WARNING: replacing module B.
Main.B

Note, however, that if you're explicitly listing the names to include, you have to list A itself or you won't be able to write A.f inside of B. So this whole thing is a bit of a mess.

StefanKarpinski avatar Mar 26 '20 15:03 StefanKarpinski

hmmm ok, I was wondering whether I should make that conversion the default but it looks like it could break code in some cases

domluna avatar Mar 26 '20 15:03 domluna

There's definitely a transformation that could be done, but it's more annoying and non-local than one would want. The normalization that would be safe is this:

  • using A replaced by using A: names... with explicitly listed names from A that are used.
  • import A replaced by using A: A, or dropped if A is never referenced
  • import A: x
    • if x is not a function which is extended, replace by using A: x
    • if x is a function which is extended, replace by using A: A, x and qualify the reference to x in the method extension signature as A.x

So yeah, that's a very annoying and fussy normalization process just to unify on using. It's simpler if you go the other direction and transform using into import:

  • using A replaced by import A: names... with explicitly listed names from A that are used.
  • using A: x replaced by import A: x

I think that should never break code. And of course, you'd want to combine import statements that come from the same module. I do also think that it's good style to fully qualify extensions to imported methods just to be 100% clear that's what you're doing.
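
As an illustration of the using-only normalization for an extended function, here is a minimal before/after sketch. Modules `A` and `B` and the helper `call` are hypothetical, and the "before" form is shown only in comments.

```julia
module A
    export f
    f(x::Int) = x + 1
end

module B
    # Before normalization (works, but relies on `import`):
    #     import ..A: f
    #     f(x::String) = "extended: " * x
    # After normalization: `using` lists A itself, and the extension is qualified as A.f.
    using ..A: A, f
    A.f(x::String) = "extended: " * x

    # Unqualified calls to f still work because f is listed in the `using`.
    call(x) = f(x)
end
```

Note that the rewrite touches both the import line and the method definition, which is why the transformation is non-local.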

StefanKarpinski avatar Mar 26 '20 15:03 StefanKarpinski

hmmm ok, I was wondering whether I should make that conversion the default but it looks like it could break code in some cases

For the purposes of YASGuide compliance, import A could become using A: A, which is what we've been doing when we need that particular behavior. So that would be the most conservative change the formatter could make for compliance.

ararslan avatar Mar 26 '20 16:03 ararslan

Rewrite function definitions inside of other function definitions to be lambdas

can you give an example?

function f(x)
    function g(y)
        # do whatever
        return thing
    end
    return something
end

should instead be

function f(x)
    g = y -> begin
        # do whatever
        return thing
    end
    return something
end

ararslan avatar Mar 26 '20 16:03 ararslan

I do also think that it's good style to fully qualify extensions to imported methods just to be 100% clear that's what you're doing.

Yup, this is a requirement of the YASGuide anyway:

When overloading a function from another module, the function name should be qualified with its module (e.g. imported_function(...) = ... is bad, ParentModule.imported_function(...) = ... is good).

jrevels avatar Mar 26 '20 16:03 jrevels

sample formatting so far https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt

julia> using JuliaFormatter

julia> format(".",
           style = YASStyle(),
           always_for_in = true,
           whitespace_ops_in_indices = true,
           whitespace_typedefs = false,
           remove_extra_newlines = true,
           import_to_using = true,
           pipe_to_function_call = true,
           short_to_long_function_def = true,
           margin=92
       )

domluna avatar Apr 02 '20 05:04 domluna

Requiring

function f(x)
    g = y -> begin
        # do whatever
        return thing
    end
    return something
end

prevents defining inner functions that have multiple methods. Seems like an overreaction to the fact that putting method definitions in the branches of a conditional doesn't do what one expects.
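
To make the limitation concrete, a minimal sketch (the function names are made up for illustration): the named inner function can carry several methods, while the lambda form has to fall back to manual branching.

```julia
function describe(x)
    # A named inner function can have several methods and dispatch on type.
    inner(v::Integer) = "integer"
    inner(v::AbstractString) = "string"
    return inner(x)
end

function describe_lambda(x)
    # A lambda has a single method, so the dispatch becomes an explicit branch.
    inner = v -> v isa Integer ? "integer" : "string"
    return inner(x)
end
```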

StefanKarpinski avatar Apr 02 '20 05:04 StefanKarpinski

prevents defining inner functions that have multiple methods. Seems like an overreaction to the fact that putting method definitions in the branches of a conditional doesn't do what one expects.

Yeah, Eric Davies and I had a good chat about this rule in Slack a while back; looks like history has been erased by now though 🤦‍♂️ should've saved it.

To provide a concrete example in the vein of what you said, the rule is there in an attempt to make it less likely for people to confuse themselves by doing stuff like

julia> f(::Int) = 1
f (generic function with 1 method)

julia> function g(x::T) where {T}
           f(::T) = 2
           f(x)
       end
g (generic function with 1 method)

julia> g(1.0)
2

julia> f(1.0)
ERROR: MethodError: no method matching f(::Float64)
Closest candidates are:
  f(::Int64) at REPL[1]:1
Stacktrace:
 [1] top-level scope at REPL[4]:1

...where a lot of newcomers will get confused/frustrated by the fact that there wasn't some persistent top-level method overload. Not that that's a reasonable expectation on their part, just a common mistake I've seen folks trip up on.

IIRC the conclusion of the chat with @iamed2 was that inner method overloads were useful enough as a feature that this rule was deemed overly defensive and not worthwhile for BlueStyle. In practice, I think you can often (but maybe not always? haven't thought about it) refactor actual usages of this feature to instead use explicitly callable types instead of inner methods. My preference for that is much more subjective and less substantiated though. This isn't a super critical rule for me, and I could be convinced to change it; for example, if we ever went through the massive bikeshed required to merge YASGuide and BlueStyle, this would be a rule I'd easily give up for the sake of compromise.
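
A rough sketch of the callable-type refactor mentioned above. `Scaler` and `rescale` are hypothetical names, and this is only one way such a refactor might look.

```julia
# Hypothetical callable type replacing an inner function with two methods.
struct Scaler
    factor::Float64
end

# Methods are attached to the type at top level, so dispatch works as usual.
(s::Scaler)(x::Number) = s.factor * x
(s::Scaler)(xs::AbstractVector) = map(s, xs)

function rescale(data, factor)
    scaler = Scaler(factor)   # captures state explicitly instead of via a closure
    return scaler(data)
end
```

The state that a closure would capture implicitly becomes an explicit field, at the cost of a top-level type definition.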

jrevels avatar Apr 02 '20 15:04 jrevels

sample formatting so far https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt

This actually introduces a few guide violations:

  1. Should have a return: https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt#diff-26bf7f3fc584320d0e91db14601713bfR42 (few other places as well)
  2. Space around = here was added, should have been untouched: https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt#diff-26bf7f3fc584320d0e91db14601713bfR48

Couple of "not sures", would be good to have @jrevels chime in on them:

  • Seems like this probably should have stayed on the same line with the if wrapped instead of the for? https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt#diff-297becf209b58cadeab45c3274a77911R163
  • Seems like this should be indented? https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt#diff-26bf7f3fc584320d0e91db14601713bfR36

ararslan avatar Apr 02 '20 16:04 ararslan

Seems like this probably should have stayed on the same line with the if wrapped instead of the for? https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt#diff-297becf209b58cadeab45c3274a77911R163

bug

Space around = here was added, should have been untouched: https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt#diff-26bf7f3fc584320d0e91db14601713bfR48

bug

Should have a return: https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt#diff-26bf7f3fc584320d0e91db14601713bfR42 (few other places as well)

not implemented yet

domluna avatar Apr 02 '20 17:04 domluna

Thanks for confirming :slightly_smiling_face:

ararslan avatar Apr 02 '20 17:04 ararslan

@ararslan

For the following

function foo()
    if true
        10
    else
        20
    end
end

would you want return inserted as well?

function foo()
    return if true
        10
    else
        20
    end
end

domluna avatar Apr 05 '20 01:04 domluna

Yep 👍
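
For reference, a runnable sketch of the explicit-return rule as it applies to both cases named in the checklist, long-form functions and do blocks. The `flag` argument and the `doubled` example are additions for illustration.

```julia
# Long-form function: the value of the trailing if/else is returned explicitly.
function foo(flag::Bool)
    return if flag
        10
    else
        20
    end
end

# do block: the block's value is also returned explicitly.
doubled = map([1, 2, 3]) do x
    return 2x
end
```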

ararslan avatar Apr 05 '20 02:04 ararslan

Made more improvements:

https://github.com/jrevels/Cassette.jl/compare/master...domluna:yasfmt https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt

domluna avatar Apr 05 '20 20:04 domluna

I'm going to clean up the draft branch and merge it in since there are non-YAS-specific improvements in there as well.

Any other additions / fixes will follow. Also, I'm going to hold off on the last 2 items on this list for now - there's a couple other items I'd like to investigate in the meantime.

domluna avatar Apr 07 '20 01:04 domluna

This is part of v0.4 now

domluna avatar Apr 08 '20 14:04 domluna

I just had a go at running JuliaFormatter on a small non-open source YASStyle repo and it worked pretty well, as vetted by @ararslan! Just ran into two things. First, I think the method

function p_kw(style::YASStyle, cst::CSTParser.EXPR, s::State)
    t = FST(cst, 0)
    for a in cst
        add_node!(t, pretty(style, a, s), s, join_lines = true)
    end
    return t
end

taken from the example https://domluna.github.io/JuliaFormatter.jl/dev/custom_styles/ should be added for YASStyle, since not having space around the = in keyword arguments is part of the style (point 2.2 of https://github.com/jrevels/YASGuide#linealignmentspacing-guidelines).

The second is that

function f()
    for long_variable_name in ("Fusce eu justo at purus finibus sagittis.",
                               "Nulla egestas magna vitae lacus.",
                               "Aenean finibus nisl at magna feugiat finibus.",
                               "Etiam sodales ligula a hendrerit efficitur.",
                               "Vestibulum sed lorem vel massa consequat.",
                               "Nulla ut turpis pretium, sollicitudin.")
        println(long_variable_name)
    end
    return 1
end

is YAS-compliant but gets undesirably formatted into

function f()
    for long_variable_name in
        ("Fusce eu justo at purus finibus sagittis.", "Nulla egestas magna vitae lacus.",
         "Aenean finibus nisl at magna feugiat finibus.",
         "Etiam sodales ligula a hendrerit efficitur.",
         "Vestibulum sed lorem vel massa consequat.",
         "Nulla ut turpis pretium, sollicitudin.")
        println(long_variable_name)
    end
    return 1
end

I'm not really sure what rule could be made but it's better to have the list start on the first line rather than have 2 items on the second line.

ericphanson avatar Jul 10 '20 16:07 ericphanson

Not sure about YAS, but it seems to me that the preferred formatting of the latter example would be:

function f()
    for long_variable_name in (
            "Fusce eu justo at purus finibus sagittis.",
            "Nulla egestas magna vitae lacus.",
            "Aenean finibus nisl at magna feugiat finibus.",
            "Etiam sodales ligula a hendrerit efficitur.",
            "Vestibulum sed lorem vel massa consequat.",
            "Nulla ut turpis pretium, sollicitudin.",
        )
        println(long_variable_name)
    end
    return 1
end

StefanKarpinski avatar Jul 10 '20 17:07 StefanKarpinski

@ericphanson

for the 1st point you can set whitespace_in_kwargs to false (defaults to true). Updated the docs to show this, it was only shown in the docstring of YASStyle before.

For the 2nd point it's nesting/breaking on in. If this is undesirable for YAS, which it might be due to how the rest of YAS is formatted, the nesting rules can be changed.

domluna avatar Jul 10 '20 18:07 domluna