rustfmt icon indicating copy to clipboard operation
rustfmt copied to clipboard

Remove each comment's leading whitespace in `comment::rewrite_comment`

Open ytmimi opened this issue 3 years ago • 2 comments

Fixes #5391

Previously, leading whitespace at the start of custom comments (CommentStyle::Custom) would lead consume_same_line_comments to not return the correct byte position for the end of the comment.

This issue cascaded into a situation where an extra indented newline was inserted into rustfmts output causing a "left behind trailing whitespace" error.

Removing leading whitespace helps to properly identify the end of the comment and properly rewrite the comment.

ytmimi avatar Jun 20 '22 15:06 ytmimi

Just want to elaborate a little more on the issue that we're experiencing here, and why I think this is the approach we should take. For reference here's the code for identify_comment:

https://github.com/rust-lang/rustfmt/blob/3de1a095e0ed52ade1b88d165d44d80455d3e6c5/src/comment.rs#L269-L408

Taking the example from the original issue identify_comment gets called with orig: &str = " //.Y\n". when we compute the style (line 276) we determine that this is a CommentStyle::DoubleDash comment.

When hitting the match on line 321 we enter the first match arm because we just computed the style as CommentStyle::DoubleDash. This means we call consume_same_line_comments with style = CommentStyle::DoubleDash, orig = " //.Y\n", and line_start = "// ".

Now lets zoom in on the for loop in consume_same_line_comments: https://github.com/rust-lang/rustfmt/blob/3de1a095e0ed52ade1b88d165d44d80455d3e6c5/src/comment.rs#L305-L317

  • the if statement returns false since the trimmed line isn't empty
  • the else if statement returns false since "//.Y" doesn't start with "// ", and when we recompute the comment style we determine the trimmed comment to be CommentStyle::Custom.
  • Lastly we hit the else case and immediately break out of the loop and return (false, 0) from consume_same_line_comments.

Now on line 364 we split the original string at 0, which gives us ("", " //.Y\n"),

When we finally get to the last conditional statement at the end of identify_comment we end up hitting the else case which recursively calls identify_comment on the trimmed version of rest, which is "//.Y\n". When this recursive call succeeds we also add an additional indented newline.

In the case where the comment is the last item in a block this additional indented newline conflicts with the indented newline added on line 314 of visitor::FmtVisitor::close_block and leads to the "left behind trailing whitespace" error.

https://github.com/rust-lang/rustfmt/blob/2c8b3bef2bb5d708e047afc4d0dadf0a85528470/src/visitor.rs#L311-L323

I suspect something similar happens in the case where the comment isn't the last item in a block, but didn't look too deeply into it since removing leading whitespace up front resolves both cases.

In conclusion, by trimming the comment up front we save ourselves from doing additional work that would lead us to recursively call identify_comment with a trimmed version of the string anyway.

ytmimi avatar Jun 21 '22 19:06 ytmimi