
Fix idempotency issue with raw strings

Open shulaoda opened this issue 3 months ago • 1 comments

Fixes #6161

Summary

The issue was in the CharClasses iterator's raw string parsing logic (src/comment.rs:1358-1372). When encountering the closing " of a zero-hash raw string (r"..."), the code incorrectly set char_kind to Normal before transitioning the state machine to CharClassesStatus::Normal.

This caused the closing quote to be classified as FullCodeCharKind::Normal instead of FullCodeCharKind::InString. The LineClasses iterator downstream depends on accurate character classification to determine string boundaries. Misclassifying the closing quote led to incorrect string boundary detection, which cascaded into wrong indentation calculations during macro formatting.
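
The misclassification can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not rustfmt's actual `CharClasses` code: the `Kind` enum, `classify`, and `last_line_start_kind` are hypothetical names, only zero-hash raw strings are recognised, and the tagging of the `r` prefix and the opening quote is an assumption. The `buggy` flag switches between tagging the closing quote as `Normal` (the behaviour described above) and as `InString`.

```rust
// Simplified, hypothetical model of the classification step; this is not
// rustfmt's actual CharClasses implementation.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Normal,
    InString,
}

/// Tags every character of `src`, recognising only zero-hash raw strings (r"...").
/// With `buggy == true` the closing quote is tagged `Normal`, as in the bug
/// described above; otherwise it is tagged `InString`.
fn classify(src: &str, buggy: bool) -> Vec<(char, Kind)> {
    let mut out = Vec::new();
    let mut in_raw = false;
    let mut chars = src.chars().peekable();
    while let Some(c) = chars.next() {
        if in_raw {
            if c == '"' {
                // Closing quote: the string state ends either way; only the
                // tag attached to the quote itself differs.
                in_raw = false;
                out.push((c, if buggy { Kind::Normal } else { Kind::InString }));
            } else {
                out.push((c, Kind::InString));
            }
        } else if c == 'r' && chars.peek() == Some(&'"') {
            // Assumption: the `r` prefix counts as code, the opening quote as string.
            out.push((c, Kind::Normal));
            if let Some(quote) = chars.next() {
                out.push((quote, Kind::InString));
            }
            in_raw = true;
        } else {
            out.push((c, Kind::Normal));
        }
    }
    out
}

/// Kind of the first character on the last line of `src`, which is what a
/// line-level pass would see when it peeks at the start of that line.
fn last_line_start_kind(src: &str, buggy: bool) -> Option<Kind> {
    let tagged = classify(src, buggy);
    let start = tagged
        .iter()
        .rposition(|&(c, _)| c == '\n')
        .map_or(0, |i| i + 1);
    tagged.get(start).map(|&(_, kind)| kind)
}
```

In this model, a last line that starts with the closing quote is seen as `Normal` in buggy mode and as `InString` otherwise, while a last line that starts with string content is seen as `InString` in both modes.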

More Details: Why the Bug Only Affects Specific Cases

Example 1: Closing quote " at line start (❌ Bug triggers)

fn f() {
    my_macro! {
        m =>
        "a": r"bb
                    ccc
",  // ← Line starts with " (closing quote)
    };
}

What happens in LineClasses for the line that starts with the closing quote (",):

// Step 1: peek() sees " as first character
// Bug: CharClasses marked it as Normal instead of InString
start_kind = FullCodeCharKind::Normal  // ❌ Wrong!

// Step 2: When reaching \n
match (start_kind, kind) {
    (FullCodeCharKind::InString, FullCodeCharKind::Normal) => {
        FullCodeCharKind::EndString
    }
    _ => kind,  // ❌ Matches here! start_kind is Normal
}

// Result: Line is marked as Normal ❌
// Impact: In trim_left_preserve_layout, this line is treated as regular code,
//         included in minimum indent calculation.
//
//         Since " is at line start, prefix_space_width = 0
//         → min_prefix_space_width = 0 (always!)
//         → new_indent_width = indent.width() + original_indent_width - 0
//         → Each formatting pass adds indent.width() on top of the existing
//           indentation, so the indentation grows without bound!
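
The arithmetic in the trace above can be checked with a toy model. This is a sketch, not the code in `trim_left_preserve_layout`: the `Line` struct, `reindent_pass`, `example_lines`, and `first_prefix_after` are hypothetical, the prefix widths (8, 20, 0) and the indent of 8 are made-up values loosely based on Example 1, and it assumes that string-content lines and the closing-quote line keep their layout. It applies only the formula quoted above, with `buggy` controlling whether the closing-quote line feeds the minimum.

```rust
// Simplified, hypothetical model of the re-indentation arithmetic; the struct
// and function names are illustrative, not rustfmt's.
#[derive(Debug, Clone, PartialEq)]
struct Line {
    prefix: usize,        // current leading-space width
    counts_for_min: bool, // does this line feed the minimum-indent computation?
    reindent: bool,       // false for lines whose layout is preserved
}

/// One formatting pass: new_prefix = indent + prefix - min_prefix.
/// `saturating_sub` only guards against underflow in this sketch.
fn reindent_pass(lines: &[Line], indent: usize) -> Vec<Line> {
    let min_prefix = lines
        .iter()
        .filter(|l| l.counts_for_min)
        .map(|l| l.prefix)
        .min()
        .unwrap_or(0);
    lines
        .iter()
        .map(|l| Line {
            prefix: if l.reindent {
                indent + l.prefix.saturating_sub(min_prefix)
            } else {
                l.prefix
            },
            ..l.clone()
        })
        .collect()
}

/// Three lines loosely modelled on Example 1 (widths are made up).
/// When `buggy` is true, the closing-quote line counts toward the minimum.
fn example_lines(buggy: bool) -> Vec<Line> {
    vec![
        Line { prefix: 8, counts_for_min: true, reindent: true },    // "a": r"bb
        Line { prefix: 20, counts_for_min: false, reindent: false }, // ccc (string content)
        Line { prefix: 0, counts_for_min: buggy, reindent: false },  // ", (closing quote)
    ]
}

/// Prefix width of the first line after `passes` formatting passes at indent 8.
fn first_prefix_after(passes: usize, buggy: bool) -> usize {
    let mut lines = example_lines(buggy);
    for _ in 0..passes {
        lines = reindent_pass(&lines, 8);
    }
    lines[0].prefix
}
```

With the closing-quote line included, the minimum is pinned at 0 and the first line moves from 8 to 16, 24, 32 over successive passes. With it excluded, the minimum is 8 and the first line stays at 8, so the result is stable.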

Example 2: Content character c at line start (✓ Works despite bug)

fn f() {
    my_macro! {
        m =>
        "a": r"bb
                    ccc
c",  // ← Line starts with c (string content)
    };
}

What happens in LineClasses for the line that starts with string content (c",):

// Step 1: peek() sees c as first character
// Correct: CharClasses marked it as InString
start_kind = FullCodeCharKind::InString  // ✓ Correct!

// Step 2: When reaching \n (kind = Normal, taken from the misclassified closing ")
match (start_kind, kind) {
    (FullCodeCharKind::InString, FullCodeCharKind::Normal) => {
        FullCodeCharKind::EndString  // ✓ Matches here!
    }
    _ => kind,
}

// Result: Line is marked as EndString ✓
// Impact: In trim_left_preserve_layout, this line is correctly excluded
//         from indent calculation → formatting is idempotent!
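
The two traces differ only in `start_kind`, which a reduced, runnable version of the match makes easy to verify. This is a hypothetical reduction: `LineKind` and `line_kind` are illustrative names, and only the arm quoted in the traces is modelled, so any other arms in the real match are left out.

```rust
// Hypothetical reduction of the line-level decision shown in the two traces;
// only the arm quoted in the text is modelled.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LineKind {
    Normal,
    InString,
    EndString,
}

/// `start_kind` is the kind of the first character on the line,
/// `kind` is the kind in effect when the newline is reached.
fn line_kind(start_kind: LineKind, kind: LineKind) -> LineKind {
    match (start_kind, kind) {
        // A line that starts inside a string and ends in code closes the string.
        (LineKind::InString, LineKind::Normal) => LineKind::EndString,
        // Everything else keeps the kind seen at the newline.
        _ => kind,
    }
}
```

Example 1 corresponds to `line_kind(LineKind::Normal, LineKind::Normal)`, which falls through to `Normal`. Example 2 corresponds to `line_kind(LineKind::InString, LineKind::Normal)`, which yields `EndString`.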

Reference

https://github.com/rust-lang/rustfmt/blob/50a49e7b3f769bcbc862603b97646b670e22706f/src/comment.rs#L1531-L1558

https://github.com/rust-lang/rustfmt/blob/50a49e7b3f769bcbc862603b97646b670e22706f/src/utils.rs#L615-L625

shulaoda, Oct 10 '25 21:10

I’m not very familiar with the codebase, so there might be a better way to handle this.

shulaoda, Oct 10 '25 21:10