tree-sitter-javascript

JSXText node trims whitespace too often

Open jackschu opened this issue 1 year ago • 0 comments

The following piece of code is valid but it is parsed incorrectly:

<div> <div/></div>

Here's a link to the TypeScript Playground showing that the snippet above is valid JavaScript or TypeScript:

https://www.typescriptlang.org/play/?#code/DwEwlgbgfABKkHorAeaQ

The output of tree-sitter parse is the following:

program [0, 0] - [1, 0]
  expression_statement [0, 0] - [0, 18]
    jsx_element [0, 0] - [0, 18]
      open_tag: jsx_opening_element [0, 0] - [0, 5]
        name: identifier [0, 1] - [0, 4]
      jsx_self_closing_element [0, 6] - [0, 12]
        name: identifier [0, 7] - [0, 10]
      close_tag: jsx_closing_element [0, 12] - [0, 18]
        name: identifier [0, 14] - [0, 17]

This is wrong because there is no jsx_text node, but as the TypeScript Playground link shows, tsc outputs:

"use strict";
React.createElement("div", null,
    " ",
    React.createElement("div", null));

Note that " " will end up inside a DOM node, so I think there should be a jsx_text node here.

This is not limited to whitespace-only text, e.g.

<div> foo </div>

will result in a DOM node with the text " foo ". So if I query for "is there a jsx_text node with text === ' foo '", tree-sitter-javascript will tell me "no", but I think the answer should be "yes".
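To make the expected children concrete without pulling in React, here is a minimal sketch. The `createElement` function is a hypothetical stand-in that only records its arguments, and the call for `<div> foo </div>` is assumed from the behaviour described above rather than copied from compiler output.

```javascript
// Hypothetical stand-in for React.createElement: it only records its arguments.
function createElement(type, props, ...children) {
  return { type, props, children };
}

// Mirrors the tsc output quoted above for `<div> <div/></div>`.
const nested = createElement("div", null, " ", createElement("div", null));

// Assumed output for `<div> foo </div>` (not copied from a compiler run).
const padded = createElement("div", null, " foo ");

// The whitespace survives as a text child in both cases.
console.log(JSON.stringify(nested.children[0])); // " "
console.log(JSON.stringify(padded.children[0])); // " foo "
```

In both cases the first child is a string that includes the surrounding spaces, which is the text I would expect a jsx_text node to cover.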


Advice / pointers

There is a lot of unspecified nuance here, i.e. the whitespace rules are not part of the JSX spec (https://github.com/facebook/jsx/issues/143), but are implemented by React (https://github.com/facebook/jsx/issues/40).

Basically, I think the jsx_text node should grow to encapsulate the rules around whitespace rather than always trimming.
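For reference, here is a rough sketch of the whitespace rules as I understand common JSX compilers to apply them. It is an approximation, not something taken from the spec, and the name `jsxTextValue` is made up for illustration. Text on a single line is kept verbatim; whitespace is only collapsed around line breaks.

```javascript
// Approximate JSX text normalisation (assumption: modelled on how common
// JSX compilers behave, not on any specification).
// Returns the text child, or null when the text produces no child at all.
function jsxTextValue(raw) {
  const lines = raw.split(/\r\n|\n|\r/);

  // Find the last line that contains something other than spaces or tabs.
  let lastNonEmpty = 0;
  lines.forEach((line, i) => {
    if (/[^ \t]/.test(line)) lastNonEmpty = i;
  });

  let out = "";
  lines.forEach((line, i) => {
    let text = line.replace(/\t/g, " ");
    // Leading whitespace is dropped on every line except the first.
    if (i !== 0) text = text.replace(/^ +/, "");
    // Trailing whitespace is dropped on every line except the last.
    if (i !== lines.length - 1) text = text.replace(/ +$/, "");
    if (text) {
      // Lines that survive are joined with a single space.
      if (i !== lastNonEmpty) text += " ";
      out += text;
    }
  });

  return out === "" ? null : out;
}

console.log(JSON.stringify(jsxTextValue(" ")));                // " "
console.log(JSON.stringify(jsxTextValue(" foo ")));            // " foo "
console.log(JSON.stringify(jsxTextValue("\n  ")));             // null
console.log(JSON.stringify(jsxTextValue("\n  foo\n  bar\n"))); // "foo bar"
```

Under these rules only whitespace that touches a line break is discarded, so a grammar that trims every jsx_text node loses the first two cases.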

Relevant: #227

jackschu, Jun 28 '24 04:06