cljfmt Add associative syntax alignment, WIP

Fixes #36

This PR is my initial pass at adding an associative syntax alignment feature to cljfmt. I ran it against a small sized codebase with good results. Before I go any farther, I'd like to gather comments.

Unanswered questions:

1: Currently, bindings are detected by looking for vectors that are preceded by one of the following symbols: doseq let loop binding with-open go-loop if-let when-some if-some for with-local-vars with-redefs

This list is found in cljfmt.core. Is that acceptable or should it be moved to a file in resources/, like the custom indents?

2: What about extending the list? Should this be allowed? If so, how? As configs?

3: Did I miss any?

3: The two special forms doseq and for allow for special keywords: :let :while :when. Currently, they are aligned just like anything else found in a binding like this:

(for [long-thing (bar)
      foo        (baz)
      :let       [bar (baz)]]])

Or should they be ignored, like this?

(for [long-thing (bar)
      foo        (baz)
      :let [bar (baz)]]])

4: What about when maps and bindings are purposely split on two lines to save space. This is often done with nested maps:

{:very-long-map-keyword "value"
 :very-long-map-keyword 
   "very long value that would not fit into 80 columns any other way"}

Comments from my team indicate that they valued determinism in how alignments are done more than fitting things into 80 columns but I know people will argue the other way. One possible solution would be to add a config setting to skip aligning any key-values that are split as shown above. A different solution would be to put a note in the readme to tell people to add an empty comment in between key-values that they wish to remain split:

{:foo 1
 :bar ;;
 "This key-value stays split because of comment semicolons above."}

The behavior shown above works correctly in the current branch.

Jul 25 '16 20:07 jmorris0x0

First, thanks for your work on this!

The forms that are used to check for alignments should be customizable. Perhaps something like:

:align-args {'let [0], 'loop [0], ...}

Where [0] is a vector of argument positions to treat as alignable. In most cases this will be the first element, i.e. the zero index.

The default forms should be stored in a separate resource file, to be consistent with the way indentation is treated.

My feeling is that alignment should only care about arguments on the same line. Joining or splitting lines should be a separate processor.

Jul 26 '16 00:07 weavejester

I feel the code could be a fair bit smaller than it is. There are functions like margin and index-of already in the codebase that I think would make the job a lot easier.

If we just take the alignment of maps, then I'd imagine something like:

(defn- max-value-margin [zloc]
  (loop [zloc (z/right zloc), n 0]
    (let [n (max n (margin zloc))]
      (if-let [zloc (some-> zloc z/right z/right)]
        (recur zloc n)
        n))))

(defn- pad-node [zloc max-margin]
  (zip/insert-left zloc (whitespace (- max-margin (margin zloc)))))

(defn- align-elements [zloc]
  (let [max-margin (max-value-margin zloc)]
    (edit-all zloc (comp even? index-of) #(pad-node % max-margin)))

(defn align-map-elements [form]
  (transform form edit-all z/map? align-elements))

Disclaimer: I haven't tested this code.

Jul 26 '16 00:07 weavejester

IMO if a user has inserted one or more newlines (which is as of recently its own token rather than being just whitespace) into a binding vector between a name and the bound value that should receive the treatment currently afforded to an empty comment. I've never encountered the empty comment before in reading code, and would much rather see support for the existing pattern of manual breaks already supported by CIDER's binding alignment algorithm.

Jul 26 '16 03:07 arrdem

That said, thank you very much for putting together this PR. I've really wanted this feature for a while now :smile:

Jul 26 '16 03:07 arrdem

Hi @jmorris0x0, nice job! Can anyone tell me status of this feature? I tried to rebase this code in my own fork and adjust it a little bit. Now I'm using it and so far so good.

Jun 02 '17 12:06 erthalion

I am also curious about the status of this feature. I am using cljfmt as the formatting ”engine” of formatting Calva Formatter and this feature would be an awesome add.

Nov 01 '18 13:11 PEZ

Is there something I can do to help this PR forward?

Nov 19 '18 21:11 PEZ

FWIW, I have applied @erthalion's branch in my fork and adjusted it to work with latest cljfmt. It is included as an experimental command in latest Calva Formatter, where it can be applied on a map by map (or bindings box by bindings box) manner. It looks like so in action:

align-items

Using it interactively is a good way to find out when it behaves and when it does not. (I've found some cases where it doesn't really do the right thing. It is not yet where many projects could use it to format all their code, I think.)

Using it in Calva Formatter to experiment with this is also a good way to really ”feel” how great the feature is. I hope that many of you out there will give it a try.

If I understand things in this thread, you want some more work on the PR before it is merged, @weavejester? I haven't done that work, yet, and am not sure I know how to do it. But still would like to help in whatever way to get this feature in. I might even tackle the work to fix the code, but then I will need some assistance from someone more experienced.

Nov 26 '18 08:11 PEZ

@PEZ, yes, the PR needs more work before it can be merged. I've detailed in the comments the things that need to be changed.

Nov 27 '18 21:11 weavejester

Hello guys Is there any chance to see features from this PR in cljfmt? Can I help with something to make it real?)

Jan 09 '19 20:01 tkachenko1503

@tkachenko1503 See my previous comment and code review.

Jan 09 '19 22:01 weavejester

I'm wondering if I should take a stab at making this merge-ready.

@weavejester, to help me visualize, when you find a moment, would you be so kind as to provide examples of how :align-args would affect code?

Feb 15 '19 17:02 lread

@lread :align-args simply denotes which arguments of a form should be aligned. So for instance in a let form:

(let [foo 1
      y   2]
  (+ foo y))

We only want to align the values in the vector, i.e. the first argument.

Feb 15 '19 17:02 weavejester

Thanks @weavejester that makes sense, can you contrive a examples for me where an :align-args vector would not be [0]?

Feb 15 '19 19:02 lread

Here's an example from Schema:

(s/defrecord StampedNames
  [date  :- Long
   names :- [s/Str]])

Notice that in this case, the pieces being aligned change as well.

Feb 15 '19 19:02 weavejester

Oh that's a good example, thank you.

Feb 16 '19 21:02 lread

@weavejester, in a comment above, you wrote:

My feeling is that alignment should only care about arguments on the same line. Joining or splitting lines should be a separate processor.

This seems appropriate because it is consistent with current cljfmt behavior, but I wonder what the alignment behavior should be in certain cases.

When elements per line is not always 2, for examples...

{:a 1 :billy
2}

{:a 1 :billy 2
 :sally 3}

... I am thinking we should leave the above as is and not attempt to align.

I suppose if the elements per line is always the same multiple of 2...

{:a 10 :alexandra 2
 :sally 3 :roger 4}

...we could align like so:

{:a     10 :alexandra 2
 :sally 3  :roger     4}

... but I am leaning toward not aligning the code in this case either as I am suspecting it might be more surprising than helpful.

When you have a moment, It'd be great to hear your thoughts.

Feb 21 '19 20:02 lread

What about if we there are N elements to a line, then we align each of the N elements into its own column? e.g.

[:one   :two
 :three :four
 :five 
 :six   :seven :eight 
 :nine  :ten   :eleven]

So we look through the data structure, counting the column size and number of columns, then align appropriately.

This would cover the let use-case, and the Schema use-case.

Feb 21 '19 23:02 weavejester

Hmm... until you proposed this, I was thinking of supporting configuring the number of expected elements per line. This would have be 2 for all our language defaults and customizable (3 for Schema use-case).

Your idea is simpler. Would the following behavior offend anyone?

{:i "have" :only "put" :this "on" :two "lines" :because "it" 
 :is-rather "somewhat" :a-bit "long"}

becomes:

{:i         "have"     :only  "put" :this "on" :two "lines" :because "it"
 :is-rather "somewhat" :a-bit "long"}

Seems ok to me.

But if the author did not keep keys and values together the alignment might be more of a hinderance than a help?

{:i   "have"     :only      "put"  :this "on" :two "lines" :because 
 "it" :is-rather "somewhat" :a-bit "long"}

Feb 21 '19 23:02 lread

We can add an option to turn it off for maps, but my initial thought is that if you have a map that long, you probably want to write it with a keypair on each line, anyway.

As for a map that doesn't keep keys and values together, perhaps another PR can introduce reformatting there.

Feb 21 '19 23:02 weavejester

Very good, thanks, that all seems sensible to me!

Feb 21 '19 23:02 lread

@weavejester do you feel using our new align-associative? should require that indentation? be enabled? If indentation? is required, align-associative? can rely on the first column already being aligned.

I can't really imagine someone using align-associative? without indentation? but my imagination is only so big.

Feb 22 '19 19:02 lread

Sorry, why would that matter? Even if the indentation is wrong, that doesn't affect how the forms are aligned, does it?

Feb 22 '19 21:02 weavejester

I think you might have answered my question with your question? The two options should stand on their own?

What I was thinking was... let's say we have:

{:one :two
         :three :four
               :five :six}

indentation? alone does the work of aligning the first column:

{:one :two 
 :three :four
 :five :six}

If the work of aligning the first column is already always done, align-associative? does not need to deal with it. But.. if the two options can stand on their own, align-associative? will need to align the first column.

Feb 22 '19 21:02 lread

Ah, I see. I think that it should work independently, as in theory you could have:

{  :one :two 
  :three :four}

And have it aligned:

{  :one   :two 
   :three :four}

In other words, we can use the indentation of the first element as a guide. If indentation is turned on, then this will be at the correct position.

Feb 22 '19 21:02 weavejester

Sounds good, thanks!

Feb 22 '19 21:02 lread

@weavejester, I'm still slowly plugging away! It might be a while before I submit a pull request, but I will get there.

Your sample code snippet above implied that whitespace should not be removed between elements.

This...

{:one :two
 :three    :four}

...would reformat to

{:one      :two
 :three    :four}

...but wouldn't this be preferable?

{:one   :two
 :three :four}

When you have a moment, I am interested to learn your preference for cljfmt.

Feb 25 '19 23:02 lread

Still working on this. I'll look at adding rewrite-clj subzip, subedit-node and postwalk to rewrite-cljs as my change makes use of these features. My current alternative is to include this support in cljfmt, but it seems sad not to just add the support to rewrite-cljs, so I'll give it a go.

Mar 07 '19 22:03 lread

Any updates on this?

Jul 10 '19 14:07 AntonioGabrielAndrade

Yes @AntonioGabrielAndrade, I do plan do come back to this after I release a cljc version of rewrite-clj.

Jul 10 '19 14:07 lread

Following up on the discussion about how to align N elements. How about a rule something like:

If there are more than one consecutive line with the same number of elements, they get aligned as a group, otherwise no alignment happens.

If we give @weavejester's example to this it would align the first two lines as group, then the third line would be left unaligned, and then line 4 and 5 would be aligned as a group. Like so:

[:one   :two
 :three :four
 :five
 :six  :seven :eight
 :nine :ten   :eleven
 
 :one-hundred-fifty   :ten    :fifteen
 :five-hundred        :twenty :twentyfive
 :one-thousand-fiifty :thirty :thirtyfive
 :foo :bar
 :a   :b]

Also in this example, an empty line would effectively reset the the alignment grouping. So line 7, 8, 9 would get different alignment than line 4 and 5, even though they have the same N elements. And then line 10 and 11 would get their alignment as a group.

Mar 11 '20 07:03 PEZ