
feat: split command filename to old/new file

Open thatlittleboy opened this issue 2 years ago • 3 comments

This PR is mostly motivated by the following problems:

  1. The diff command has two filenames, but the parser currently parses everything after diff --git as a single filename node, which is wrong.
  2. As a result, these filenames are highlighted differently from the ones in the diff output, whereas ideally they should carry the same semantic meaning. I propose parsing the command as diff --git (old_file) (new_file) so that these filenames get the same highlighting/semantic meaning as the ones in the diff output, --- (old_file) and +++ (new_file).

thatlittleboy, Jan 01 '23 15:01

On this input (taken from your playground):

diff --git a/.gitmodules b/.gitmodules
index d5bd61c9e..422671b4e 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -174,3 +174,7 @@
 	path = helix-syntax/languages/tree-sitter-git-commit
 	url = https://github.com/the-mikedavis/tree-sitter-git-commit.git
 	shallow = true
+[submodule "helix-syntax/languages/tree-sitter-git-diff"]
+	path = helix-syntax/languages/tree-sitter-git-diff
+	url = https://github.com/the-mikedavis/tree-sitter-git-diff.git
+	shallow = true

the query output is now:

a.diff
  pattern: 4
    capture: 4 - variable.builtin, start: (0, 0), end: (0, 38), text: `diff --git a/.gitmodules b/.gitmodules`
  pattern: 1
    capture: 1 - keyword, start: (0, 11), end: (0, 24), text: `a/.gitmodules`
  pattern: 0
    capture: 0 - string, start: (0, 25), end: (0, 38), text: `b/.gitmodules`
  pattern: 2
    capture: 2 - constant, start: (1, 6), end: (1, 15), text: `d5bd61c9e`
  pattern: 2
    capture: 2 - constant, start: (1, 17), end: (1, 26), text: `422671b4e`
  pattern: 1
    capture: 1 - keyword, start: (2, 0), end: (2, 17), text: `--- a/.gitmodules`
  pattern: 0
    capture: 0 - string, start: (3, 0), end: (3, 17), text: `+++ b/.gitmodules`
  pattern: 3
    capture: 3 - attribute, start: (4, 0), end: (4, 19), text: `@@ -174,3 +174,7 @@`
  pattern: 0
    capture: 0 - string, start: (8, 0), end: (8, 58), text: `+[submodule "helix-syntax/languages/tree-sitter-git-diff"]`
  pattern: 0
    capture: 0 - string, start: (9, 0), end: (9, 52), text: `+  path = helix-syntax/languages/tree-sitter-git-diff`
  pattern: 0
    capture: 0 - string, start: (10, 0), end: (10, 65), text: `+    url = https://github.com/the-mikedavis/tree-sitter-git-diff.git`
  pattern: 0
    capture: 0 - string, start: (11, 0), end: (11, 16), text: `+    shallow = true`

Notice that the a/.gitmodules and b/.gitmodules from diff --git a/.gitmodules b/.gitmodules are now picked up by the query, and that they get the same captures as --- a/.gitmodules and +++ b/.gitmodules respectively (keyword for the old file, string for the new file).
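As a sanity check on the column ranges above, here is a minimal Python sketch of the split this PR aims for. It is not the grammar itself: `split_header` is a hypothetical helper, and it assumes the two filenames are separated by a single space and contain no spaces themselves.

```python
# Hypothetical helper, not part of tree-sitter-diff: split a `diff --git`
# header on spaces and report (text, start_col, end_col) for each name.
PREFIX = "diff --git "


def split_header(line: str) -> list[tuple[str, int, int]]:
    """Return (text, start_col, end_col) for each space-separated name."""
    if not line.startswith(PREFIX):
        raise ValueError("not a `diff --git` header")
    spans = []
    col = len(PREFIX)
    for name in line[len(PREFIX):].split(" "):
        if name:
            spans.append((name, col, col + len(name)))
        col += len(name) + 1  # +1 for the separating space
    return spans


header = "diff --git a/.gitmodules b/.gitmodules"
old_file, new_file = split_header(header)
print(old_file)  # ('a/.gitmodules', 11, 24)
print(new_file)  # ('b/.gitmodules', 25, 38)
```

The columns match the `keyword` capture at (0, 11) to (0, 24) and the `string` capture at (0, 25) to (0, 38) in the query output.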

thatlittleboy, Jan 01 '23 15:01

I was interested in adding this, but it's not straightforward when filenames contain spaces:

diff --git a/a b.txt b/a b.txt
index 86e041d..46add00 100644
--- a/a b.txt   
+++ b/a b.txt   
@@ -1,3 +1,3 @@
 foo
-bar
+baz
 baz

On this branch:

$ tree-sitter parse f.diff
(source [0, 0] - [9, 0]
  (command [0, 0] - [0, 20]
    (old_file [0, 11] - [0, 14])
    (new_file [0, 15] - [0, 20]))
  (ERROR [0, 21] - [0, 30]
    (ERROR [0, 21] - [0, 30]))
  (index [1, 0] - [1, 29]
    (commit [1, 6] - [1, 13])
    (commit [1, 15] - [1, 22])
    (mode [1, 23] - [1, 29]))
  (old_file [2, 0] - [2, 7]
    (filename [2, 4] - [2, 7]))
  (ERROR [2, 8] - [2, 13]
    (ERROR [2, 8] - [2, 13]))
  (new_file [3, 0] - [3, 7]
    (filename [3, 4] - [3, 7]))
  (ERROR [3, 8] - [3, 13]
    (ERROR [3, 8] - [3, 13]))
  (location [4, 0] - [4, 15]
    (linerange [4, 3] - [4, 7])
    (linerange [4, 8] - [4, 12]))
  (context [5, 0] - [5, 4])
  (deletion [6, 0] - [6, 4])
  (addition [7, 0] - [7, 4])
  (context [8, 0] - [8, 4]))
f.diff	0 ms	(ERROR [0, 21] - [0, 30])
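
In the tree above, old_file covers only `a/a`, new_file covers only `b.txt`, and the remaining `b/a b.txt` becomes an ERROR node. To illustrate why the header is ambiguous, here is a rough Python sketch (not the grammar; `candidate_splits` and `pick_split` are hypothetical names). It lists every way to cut the names at a space, then applies one possible heuristic: keep the cut where both halves are the same path behind git's default a/ and b/ prefixes. That assumption only holds when no rename or copy is involved and the prefixes have not been changed.

```python
from typing import Optional


def candidate_splits(names: str) -> list[tuple[str, str]]:
    """Every way to cut `names` into (old, new) at a single space."""
    return [(names[:i], names[i + 1:]) for i, ch in enumerate(names) if ch == " "]


def pick_split(names: str) -> Optional[tuple[str, str]]:
    """Pick the cut where old is a/<path> and new is b/<path> with equal paths.

    Assumes default a/ and b/ prefixes and no rename; returns None otherwise.
    """
    matches = [
        (old, new)
        for old, new in candidate_splits(names)
        if old.startswith("a/") and new.startswith("b/") and old[2:] == new[2:]
    ]
    return matches[0] if len(matches) == 1 else None


names = "a/a b.txt b/a b.txt"
for old, new in candidate_splits(names):
    print(repr(old), repr(new))
# 'a/a' 'b.txt b/a b.txt'
# 'a/a b.txt' 'b/a b.txt'
# 'a/a b.txt b/a' 'b.txt'
print(pick_split(names))  # ('a/a b.txt', 'b/a b.txt')
```

An "equal halves" check like this is not something plain grammar rules can express as far as I know, so a real fix would probably need a different approach, for example an external scanner.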

the-mikedavis, Jan 02 '23 15:01

I see, that's a good point. Let me think about it and revisit once I have a solution.

thatlittleboy, Jan 02 '23 16:01