ruff
ruff copied to clipboard
Convert raw strings to non-raw when fixes add escape sequences (#13294)
Summary
This aims to resolve #13294 by implementing raw to non-raw string conversion, as suggested by @dscorbett. The conversion is comprehensive (as far as I'm aware), but it does introduce an unfortunate amount of complexity to the rule.
I'm new to this codebase and not a Rust expert, so my code might not be idiomatic.
Test Plan
I have added the following test cases:
raw_single_singlequote = r'\ \' " '
raw_triple_singlequote = r'''\ ' " '''
raw_single_doublequote = r"\ ' \" "
raw_triple_doublequote = r"""\ ' " �"""
raw_single_singlequote_multiline = r'\' \
" \
'
raw_triple_singlequote_multiline = r'''' \
" \
'''
raw_single_doublequote_multiline = r"' \
\" \
"
raw_triple_doublequote_multiline = r"""' \
" \
"""
raw_nested_fstrings = rf'\ {rf'\ {rf'\' '}'}'
ruff-ecosystem results
Linter (stable)
✅ ecosystem check detected no linter changes.
Linter (preview)
✅ ecosystem check detected no linter changes.
Formatter (stable)
✅ ecosystem check detected no format changes.
Formatter (preview)
✅ ecosystem check detected no format changes.
It would also be good to test triple-quoted strings whose final characters match the outer quotation marks:
r"""\""""
r'''\''''
The raw_nested_fstrings tests don’t include any of the relevant control characters so they are not effective tests for these rules’ fixes.
Good points, thank you!
My previous example was incomplete: quotation marks in triple-quoted strings need escaping when they precede two more of the same quotation mark. This can also happen in the middle of a string:
r"""\"""..."""
r'''\'''...'''
In this case, being able to fix 1-2 raw strings doesn't justify the added complexity.
I kind of agree… imho, the fix should be marked unsafe for raw strings. I can open a new PR for that if preferred.
Regardless, this was a good excuse to learn Rust better :)
I think the rules should either offer safe fixes for raw strings or no fixes. Keeping the current fixes but marking them unsafe doesn’t seem useful, because they are incorrect.
I agree. We should not offer fixes if we know they're incorrect. We could consider offering fixes for raw-strings if they don't contain any quotes or backslashes.
Sorry for leaving this hanging. I've intended to scrap this work in favour of not offering to fix raw strings, but life got in the way. Happy to revise this next month.
No worries and thanks for the heads up. There are no expectations at all. I only requested changes to make it clear that this PR isn't waiting on me.
@ThatsJustCheesy would you like me to close this PR so you can make a new one with the simpler "skip it" heuristic? That way, too, if we ever decide to revisit fixing raw strings, we could revive this branch and its logic more easily. What do you think?
@ThatsJustCheesy I'm gonna go ahead and close this for now, since it sounds like the direction has changed. But please don't hesitate to ping me if you'd like me to re-open it!