Incorrect parsing of (some) expressions within f-string interpolations

Open bm424 opened this issue 3 years ago • 2 comments

[Sorry for the long post, thought best to give as much detail on this one as possible given it's quite niche]

The problem

In the following code snippet, both the if statement's test and the f-string's interpolation body are parsed as named_expression:

if x := 5:
    f"{x := 5}"

Unparenthesized named expressions (and, incidentally, lambda expressions) aren't allowed inside interpolations (from the Python docs):

Expressions in formatted string literals are treated like regular Python expressions surrounded by parentheses, with a few exceptions. An empty expression is not allowed, and both lambda and assignment expressions := must be surrounded by explicit parentheses.

The second x := 5 should be interpreted as an interpolation with the identifier x as its expression, followed by the : separator and a format specifier (= 5).
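For reference, CPython's own parser can be used to check this reading. The snippet below is only a sketch using the standard ast module; describe_interpolation is a helper name made up for this example and is not part of tree-sitter-python.

```python
import ast


def describe_interpolation(source):
    """Return (expression node type, has format spec) for the first
    interpolation of an f-string given as source text.

    Hypothetical helper, for illustration only.
    """
    joined = ast.parse(source, mode="eval").body  # an ast.JoinedStr
    field = next(v for v in joined.values if isinstance(v, ast.FormattedValue))
    return type(field.value).__name__, field.format_spec is not None


# Unparenthesized: the identifier `x` plus a format specifier, no assignment.
bare = describe_interpolation('f"{x := 5}"')

# Parenthesized: a real assignment expression and no format specifier.
wrapped = describe_interpolation('f"{(x := 5)}"')

x = 7
# Formats the existing value of x with the spec "= 5"; x is not rebound.
rendered = f"{x := 5}"
```

Here bare should report a Name node with a format spec and wrapped a NamedExpr node without one, which is the distinction the grammar currently misses.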

This is definitely an edge-case bug, but it took me a day to identify and kind of fix.

Possible fix

I have this working in a fork, but not sure it's the best approach.

Interpolations can't contain named expressions, nor can they contain, for example, boolean operations which themselves contain named expressions. For example, the following is illegal:

f"{a := 3 or b := 0}"
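As a quick check against CPython, the sketch below evaluates the bare form and a parenthesized variant. try_fstring is a hypothetical helper written for this example. My understanding is that CPython takes everything after the first top-level : as a format specifier, so the bare form is rejected when the value is formatted rather than being read as two named expressions; the exact exception type is an assumption about the CPython versions I know of, not something the grammar depends on.

```python
def try_fstring(source, **names):
    """Evaluate an f-string given as source text.

    Returns the resulting string, or the name of the exception type if
    CPython rejects it. Hypothetical helper, for illustration only.
    """
    try:
        return eval(source, {}, dict(names))
    except (SyntaxError, ValueError) as exc:
        return type(exc).__name__


# Bare form: `a` followed by the (invalid) format spec "= 3 or b := 0".
bare = try_fstring('f"{a := 3 or b := 0}"', a=1)

# Parenthesized form: two genuine named expressions joined by `or`.
wrapped = try_fstring('f"{(a := 3) or (b := 0)}"')
```

Only the parenthesized form yields a boolean operation containing named expressions, so that is the only shape the interpolation rule would need to accept for this case.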

So, for every type of expression that can appear in an interpolation (comparison, not operator, boolean operator, await, conditional expressions, and all the "primary_expressions"), create an interpolation-friendly duplicate, which can itself only contain interpolation-friendly expressions.

    _interpolation_expression: $ => choice(
      $.comparison_operator,
      alias($._interpolation_not_operator, $.not_operator),
      alias($._interpolation_boolean_operator, $.boolean_operator),
      alias($._interpolation_await, $.await),
      $.primary_expression,
      alias($._interpolation_conditional_expression, $.conditional_expression)
    ),

    interpolation: $ => seq(
      '{',
      $._interpolation_expression,
      optional('='),
      optional($.type_conversion),
      optional($.format_specifier),
      '}'
    ),

with, for example, the interpolation not operator defined as

    _interpolation_not_operator: $ => prec(PREC.not, seq(
      'not',
      field('argument', $._interpolation_expression)
    )),

bm424 avatar May 27 '22 09:05 bm424

PR welcome, as long as it doesn't increase state count too much. Tree-sitter is permissive on purpose for that reason.

amaanq avatar Aug 16 '23 03:08 amaanq

also this image

edit: ignore the red dot, it's a mistake in the image

johannesrld avatar Apr 11 '24 14:04 johannesrld

this is an unclear example

amaanq avatar Apr 12 '24 01:04 amaanq