ruff
Preserve multiline strings in code generator
This is a draft PR for #7799 posted for discussion. Minimally tested with cargo dev round-trip.
Extending ruff_python_literal to know more about Python string literals (raw, triple-quoted, etc.) should be uncontroversial. The changes here are incomplete (no raw strings, universal newlines, bytes...), but completing them shouldn't be difficult.
I am less sure about the ruff_python_codegen changes. The AST doesn't have enough information. Instead of adding the needed information to it (after all, it's an AST, not a CST), this PR gives Generator an optional Locator. When a Locator is available, string kind information is taken from the original source and passed to ruff_python_literal. Behavior is unchanged if no Locator is available.
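To make the optional-Locator idea concrete, here is a minimal, self-contained sketch. It does not use ruff's real types: `Locator`, `Generator`, `StringKind`, and `string_kind` below are hypothetical stand-ins, and byte offsets stand in for ruff's text ranges. It only illustrates the shape of the fallback: with a locator, the quote style is read back from the source; without one, the lookup yields `None` and the caller keeps its default style.

```rust
/// Hypothetical stand-in for ruff's `Locator`: it only wraps the source text.
struct Locator<'a> {
    source: &'a str,
}

impl<'a> Locator<'a> {
    /// Return the source text in the byte range `start..end`.
    /// Panics if the range is out of bounds, so callers must pass valid offsets.
    fn slice(&self, start: usize, end: usize) -> &'a str {
        &self.source[start..end]
    }
}

/// String style facts recovered from the original source (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq)]
struct StringKind {
    quote: char,
    triple_quoted: bool,
}

/// Hypothetical generator that holds an optional locator.
struct Generator<'a> {
    locator: Option<&'a Locator<'a>>,
}

impl<'a> Generator<'a> {
    /// Look up how the string literal at `start..end` was written.
    /// Returns `None` when no locator is available or the text does not
    /// start with a quote, so the caller falls back to its default style.
    fn string_kind(&self, start: usize, end: usize) -> Option<StringKind> {
        let text = self.locator?.slice(start, end);
        // Skip prefix letters such as `r`, `b`, `u`, or `f`.
        let body = text.trim_start_matches(|c: char| c.is_ascii_alphabetic());
        let quote = body.chars().next().filter(|c| *c == '"' || *c == '\'')?;
        let triple: String = std::iter::repeat(quote).take(3).collect();
        Some(StringKind {
            quote,
            triple_quoted: body.starts_with(triple.as_str()),
        })
    }
}
```

Returning an `Option` keeps the "behavior is unchanged without a Locator" property explicit at the call site.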
ruff-ecosystem results
Linter (stable)
✅ ecosystem check detected no linter changes.
Linter (preview)
✅ ecosystem check detected no linter changes.
@sanxiyn -- in https://github.com/astral-sh/ruff/pull/10298, I modified ruff's AST so that we now track many stylistic details in the AST that were previously not tracked:
- Whether a string is raw or not
- The quoting style (double or single quotes)
- Whether a string is triple-quoted or not
Would you be interested in updating your PR to use the information newly available in our AST? If not, I can pick this up and list you as a co-author.
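As a rough illustration of how the three tracked details could drive code generation, here is a small sketch. The `Quote` and `StringStyle` types and the `wrap` method are hypothetical names invented for this example, not the types added in the linked PR, and the body is assumed to be already escaped for the chosen quoting style.

```rust
/// Hypothetical quote style, mirroring "double or single quotes".
#[derive(Debug, Clone, Copy, PartialEq)]
enum Quote {
    Single,
    Double,
}

/// Hypothetical per-string style record covering the three listed details.
#[derive(Debug, Clone, Copy, PartialEq)]
struct StringStyle {
    raw: bool,
    quote: Quote,
    triple_quoted: bool,
}

impl StringStyle {
    /// Wrap an already-escaped body in the prefix and delimiters
    /// described by this style. No escaping is performed here.
    fn wrap(&self, body: &str) -> String {
        let q = match self.quote {
            Quote::Single => "'",
            Quote::Double => "\"",
        };
        let delim = if self.triple_quoted { q.repeat(3) } else { q.to_string() };
        let prefix = if self.raw { "r" } else { "" };
        format!("{prefix}{delim}{body}{delim}")
    }
}
```

With the style stored on the node, a generator built this way would need no access to the original source to preserve a multiline string.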
Using the Locator in the Generator is a clever way to avoid tracking additional data in the AST. However, it won't work for AST nodes with no corresponding element in the source (AST nodes that have been built or modified manually). We could detect such nodes by testing whether the range is 0..0, but I'm somewhat hesitant about making the Generator depend on the Locator.
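The range check mentioned above could look roughly like the following sketch. It assumes that manually built nodes carry an empty default range such as 0..0; `has_source_text` is a hypothetical helper, and plain byte ranges stand in for ruff's range type.

```rust
use std::ops::Range;

/// Decide whether a node's range can safely be used to look up source text.
/// Assumption: nodes built by hand carry an empty range such as `0..0`.
/// Any empty or out-of-bounds range is treated as "no source available".
fn has_source_text(range: &Range<usize>, source_len: usize) -> bool {
    range.start < range.end && range.end <= source_len
}
```

Note that a check like this only catches nodes with an empty or invalid range. A node that was modified in place but kept its old, non-empty range would still pass, which is one reason a source lookup is fragile here.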
I'm going to close this PR for now as @AlexWaygood is planning to take a different approach by leveraging the augmented AST.
Thanks @sanxiyn!