google-java-format Comment line wrapping is not idempotent

Here's a test case non-idempotent.zip

Bisecting seems to implicate 79a39179f8c "Wrap line comments that exceed the column limit" by @cushon, which makes sense to me as a culprit.

Oct 12 '17 19:10 novalis

Good catch.

input to output generates column-aligned // comment lines:

public class Morx {

  public String toString() {
    return SomeThingy.toStringHelper(this)
        .add("blort", blort)
        .add(
            "more-blort",
            moreBlort) // with a long-winded line-comment which makes the line than a hundred or so
                       // characters
        .add("morx", fleem)
        .toString();
  }
}

output to output removes the spaces used to align the comment lines:

public class Morx {

  public String toString() {
    return SomeThingy.toStringHelper(this)
        .add("blort", blort)
        .add(
            "more-blort",
            moreBlort) // with a long-winded line-comment which makes the line than a hundred or so
        // characters
        .add("morx", fleem)
        .toString();
  }
}

Oct 13 '17 08:10 sormuras

Note that this is only an issue for line comments that appear at the end of a non-blank line, and that exceed the column limit. It was a known issue when 79a39179f8cb0e54b0191ce1eeb39985ec03de4f was added, but the idea was that staying within the column limit was more important than the idempotency bug.

@novalis can you expand on your concerns from #214?

Oct 16 '17 18:10 cushon

We run google-java-format at build time; we fail a build if there are any changes between the formatted code and the code that is being built. This is similar to the new dry-run mode, but because our system predates that mode, we do it manually. If a user "fixes" the error by manually running google-java-format over their code (or if their pre-commit hook does so), we definitely don't want to then fail their build due to a formatting error that they just supposedly fixed!

Because older versions of google-java-format didn't wrap these comments, we have a fair amount of existing code which has non-wrapped comments. So when I went to upgrade google-java-format (which I wanted to do because of #208), I discovered that, despite running the new formatter over everything, my build failed. I could, I guess, run the new formatter twice over everything, but I'm reluctant to do that for a couple of reasons:

A fair amount of our code is not yet opted into the formatter, and this non-idempotence will make it hard for that code to opt in, since users will have to know to run the formatter twice.
After the second run, the formatting is, to my eye, unpleasant. I get some amount of grief from users about formatting anyway (as you can imagine, programmers are sometimes picky about what their code looks like, and don't want weird standards to be imposed on them). I don't want to put in formatting that I know is weird, because it is likely to cause more complaints.

Oct 16 '17 18:10 novalis

BTW, I know that formatting code is among the hardest jobs on earth, and I respect you immensely for doing it and helping us do it. I know that there are always tradeoffs, and that it's not always obvious what those tradeoffs are going to be, especially when you can't test on our codebase. So this is just to explain why we find this situation difficult, and why it's blocking our upgrade. We appreciate the hard work you put into this software.

Oct 16 '17 18:10 novalis

Thanks for the context. Do you have an estimate for how often these kind of comments appear in your code base? My impression had been that they're uncommon enough that running the formatter twice in the rare cases where this comes up might be tolerable. If you're seeing a large number of them, maybe we should turn this off for now.

And the issue with enforcing entire files match and then upgrading to a new version is why we only run the formatter on changed lines. If you want to check entire files you're signing up for some amount of churn between releases.

What would your preferred formatting for that example be? I agree that the line comment reflowing sometimes produces weird looking results, but getting lint errors after running the formatter is annoying, and it's always possible to adjust the comments by hand. I'd probably move the comment to its own line:

        // with a long-winded line-comment which makes the line than a hundred or so characters
        .add("more-blort", moreBlort)
        .add("morx", fleem)

Instead of:

        .add(
            "more-blort",
            moreBlort) // with a long-winded line-comment which makes the line than a hundred or so
        // characters
        .add("morx", fleem)

Oct 16 '17 19:10 cushon

Only running the formatter on changed files would be smart, but our build system isn't currently smart enough to do that (and it's not easy to make it smart enough). We're in the process of migrating to Bazel, which presumably is smart enough but it will be some time before that's a real solution for us.

I'm disinclined to manually reformat the files because it's a lot of work, and since I don't know the context, I'm worried about making errors. That said, I guess it might be my best option, so I'll look at how bad it would be.

I think my preferred formatting is:

        .add(
            "more-blort",
            moreBlort) // with a long-winded line-comment which makes the line than a hundred or so
                       // characters
        .add("morx", fleem)

But I recognize that it might be hard for the formatter to distinguish between the cases "long line-comment" and "also, there's a comment on the next line", which would look like:

        .add(
            "more-blort",
            moreBlort) // long line-comment on adding more-blort which is long
        // comment on .add("morx", "fleem")
        .add("morx", fleem)

I haven't really examined the code to even see if that distinction is feasible. But if it is, that would be the dream. In the case where the current format is not either of those the two, I don't really care what the formatter does; in the case where the line is too long, it should prefer the former.

Oct 16 '17 19:10 novalis

That sort of horizontal alignment is gently discouraged (at least not encouraged) by the style guide: https://google.github.io/styleguide/javaguide.html#s4.6.3-horizontal-alignment

We were trying to avoid having the formatter introduce lint errors. Previously it would sometimes increase the indentation of nested code enough that an existing comment was pushed over the column limit. Anecdotally there were more complaints about the formatter introducing those lint errors than there have been about idempotency.

That said, I guess it might be my best option, so I'll look at how bad it would be.

I don't want this to prevent you from upgrading, so let me know how that goes.

I guess we could always add a flag to disable comment reflowing...

Oct 16 '17 20:10 cushon

I thought about it, and decided that it didn't really matter whether it would be easy for me to manually fix the existing cases, because our users are likely to occasionally create new cases, and if they do so, they'll get bitten by the non-idempotency and complain.

I agree that the style guide's reasoning in 4.6.3, in general, is sound. But given that reasoning, maybe the style guide should also forbid //-at-end-of-non-blank-line comments followed by //-comments entirely? Sensibly formatting them will create a "blast radius", and formatting them in in an oblivious fashion (as g-j-f does today) risks reformatting a comment such that it appears to be attached to the wrong line. Since there's no correct way to handle them, they should be banned.

Disabling comment reflow would be solve my immediate problem, but I'm worried about how that will interact with e.g. the intellij internal (xml file) formatter. Or Eclipse, since we still have a few Eclipse die-hards.

Oct 16 '17 21:10 novalis

our users are likely to occasionally create new cases, and if they do so, they'll get bitten by the non-idempotency and complain.

I'm trying to weigh that against users complaining about lint errors from line comments that exceed the column limit. Obviously it would be better if the formatter could wrap line comments idempotently, but I'm not seeing any simple fixes for the underlying bug right now.

I'm worried about how that will interact with e.g. the intellij internal (xml file) formatter.

Any additional flags we add probably aren't going to show up in the GJF IntelliJ plugin. How does this relate to XML formatting?

Oct 16 '17 21:10 cushon

There are two ways to enforce google style with with intellij: 1. This project (and its plugin), and 2. https://github.com/google/styleguide/blob/gh-pages/intellij-java-google-style.xml

Last time I checked, some of our users use one, and some use the other. I know that the latter is does not have perfect fidelity, but I think some versions of intellij like it better (I don't really know the details, because worrying about it hasn't reached the top of my stack yet and isn't likely to until we have many more complaints). Anyway, having an additional option makes me worry about more fidelity issues with users who use the xml file. But maybe they're already doomed enough that it doesn't matter.

Having the flag not appear in the GJF Intellij plugin seems somewhat painful, since then our users would have no way of correctly formatting from within their IDEs.

Oct 16 '17 21:10 novalis

I think the idempotent (but somewhat evil) way to wrap these comments is to move them entirely to the next line. Of course, this is not what the user intended, but hopefully they will look at the diff and correctly understand that the formatter simply forbids too-long end-of-line comments. Ideally, this would involve a patch to https://github.com/google/styleguide (to explain why this sort of comment is forbidden).

Oct 16 '17 22:10 novalis

I remembered that this change already shipped in version 1.4. I want to find a solution to this issue, but I don't think that needs to block the next release.

I think the idempotent (but somewhat evil) way to wrap these comments is to move them entirely to the next line.

Pushing too-long line comments to the next line would be OK with me, although I'm not sure it's any easier to implement than fixing the current approach. Changing the style guide is out of scope for this bug.

For what it's worth, this issue also affects the way we run the formatter internally as a presubmit check. I'm sure people have encountered it, but none of them have filed bugs. I can't say the same about when the formatter wasn't wrapping line comments. So we think this is a positive for most of our users.

Oct 18 '17 08:10 cushon

When you say "fixing the current approach", I'm not sure what you mean. If you want to wrap these lines, you have to put the excess somewhere. You don't want to use whitespace for alignment, and you probably don't want to just do what the current formatter does when run twice, because it could appear to run into another // comment and that would be bad. Maybe that really is what you prefer though.

Oct 18 '17 20:10 novalis

I meant providing the behaviour that you'd currently see from running the formatter twice, but on the first run:

        .add(
            "more-blort",
            moreBlort) // with a long-winded line-comment which makes the line than a hundred or so
        // characters
        .add("morx", fleem)

I'm saying it's sometimes preferable to a lint error, not that it's the best possible formatting.

Oct 18 '17 21:10 cushon

I guess that would solve my central problem.

Oct 18 '17 22:10 novalis

Pushing too-long line comments to the next line would be OK with me, although I'm not sure it's any easier to implement than fixing the current approach. Changing the style guide is out of scope for this bug.

I am concerned that this would lead to confusion. A comment before a line means a different thing than one after the line. I think of an end-of-line comment as a before-line comment that is formatted to save space. (I get the feeling this is the more common view, but I don't have evidence to back it up.)

I agree with @cushon's suggestion for achieving idempotence, given other constraints.

Jun 22 '18 19:06 mernst

I was just going to file a bug report on this issue. The non-idempotency causes spurious linting issues for our developers and they don't necessarily know that it needs to run again to fix it. "But I ran the formatter!"

Jun 05 '19 14:06 davidburstromspotify

@cushon is there any update on this issue? @novalis can correct me if I am wrong, but I think we would be fine with a flag disabling comment reflow (though not ideal)

Jul 06 '19 01:07 petroseskinder

Hm, I guess that would work. Basically, anything that would give us an idempotent run.

On July 5, 2019 9:30:28 PM EDT, petros [email protected] wrote:

@cushon is there any update on this issue? @novalis can correct me if I am wrong, but I think we would be fine with a flag disabling comment reflow (though not ideal)

-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/google/google-java-format/issues/211#issuecomment-508887568

-- Sent from my Android device with K-9 Mail. Please excuse my brevity.

Jul 06 '19 23:07 novalis

Any updates? This is very annoying.

Mar 05 '20 21:03 mkurz

google-java-format google-java-format copied to clipboard

Comment line wrapping is not idempotent

google-java-format
google-java-format copied to clipboard