tree-sitter-python
Incorrect parsing of (some) expressions within f-string interpolations
[Sorry for the long post, thought best to give as much detail on this one as possible given it's quite niche]
The problem
In the following code snippet, both the if statement's test and the f-string's interpolation body are parsed as named_expression:
if x := 5:
    f"{x := 5}"
Named expressions (and, incidentally, lambda expressions) aren't allowed unparenthesized inside interpolations (from the Python docs):
Expressions in formatted string literals are treated like regular Python expressions surrounded by parentheses, with a few exceptions. An empty expression is not allowed, and both lambda and assignment expressions := must be surrounded by explicit parentheses.
The second x := 5 should be interpreted as an interpolation with an identifier on the left and a format specifier (=5) on the right.
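To check how CPython itself reads these interpolations, here is a minimal sketch using the standard-library ast module. It assumes Python 3.8 or later (needed for ast.NamedExpr); the helper names interpolation_parts and is_valid are made up for illustration and are not part of tree-sitter or this grammar.

```python
import ast


def interpolation_parts(source):
    """Return (expression node type, format spec text or None) for the
    first interpolation found in an f-string expression."""
    tree = ast.parse(source, mode="eval")
    # ast.walk is breadth-first, so this yields the outermost interpolation.
    field = next(n for n in ast.walk(tree) if isinstance(n, ast.FormattedValue))
    spec = None
    if field.format_spec is not None:
        # The format spec is itself a JoinedStr; collect its literal parts.
        spec = "".join(
            part.value
            for part in field.format_spec.values
            if isinstance(part, ast.Constant)
        )
    return type(field.value).__name__, spec


def is_valid(source):
    """Report whether CPython accepts the source as an expression."""
    try:
        ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return True


# Unparenthesized: the expression is a plain name, the rest is a format spec.
bare = interpolation_parts('f"{x := 5}"')
# Parenthesized: the expression really is a named expression.
wrapped = interpolation_parts('f"{(x := 5)}"')

print(bare)
print(wrapped)
print(is_valid('f"{lambda x: 1}"'))    # bare lambda in an interpolation
print(is_valid('f"{(lambda x: 1)}"'))  # parenthesized lambda
```

Comparing these results against the tree-sitter parse of the same strings is one way to confirm the mismatch described above.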
This is definitely an edge-case bug, but it took me a day to identify and kind-of fix.
Possible fix
I have this working in a fork, but not sure it's the best approach.
Interpolations can't contain named expressions, nor can they contain, for example, boolean operations which themselves contain named expressions. For example, the following is illegal:
f"{a := 3 or b := 0}"
So, for every type of expression that can appear in an interpolation (comparison, not operator, boolean operator, await, conditional expression, and all the "primary_expressions"), create an interpolation-friendly duplicate that can itself only contain interpolation-friendly expressions:
_interpolation_expression: $ => choice(
  $.comparison_operator,
  alias($._interpolation_not_operator, $.not_operator),
  alias($._interpolation_boolean_operator, $.boolean_operator),
  alias($._interpolation_await, $.await),
  $.primary_expression,
  alias($._interpolation_conditional_expression, $.conditional_expression)
),
interpolation: $ => seq(
  '{',
  $._interpolation_expression,
  optional('='),
  optional($.type_conversion),
  optional($.format_specifier),
  '}'
),
with, for example, the interpolation not operator defined as
_interpolation_not_operator: $ => prec(PREC.not, seq(
  'not',
  field('argument', $._interpolation_expression)
)),
PR welcome, as long as it doesn't increase the state count too much. Tree-sitter is permissive on purpose for that reason.
also this
edit: ignore the red dot, it's a mistake in the image
this is an unclear example
