vim icon indicating copy to clipboard operation
vim copied to clipboard

refactor vim indent plugin to support Vim9 script

Open lacygoill opened this issue 3 years ago • 31 comments

This is an attempt at improving the indentation for Vim9 script, and fix the issues mentioned in https://github.com/vim/vim/issues/11023

It should not be merged as it is; it still needs to be worked on. But it's a start.

I don't know how well it works for legacy Vim scripts.


One thing which would help is for Vim to introduce a v:indent dictionary variable in which we could get and save some values. It would only be set while an indentation is being performed. It could give us a few useful information (e.g. shiftwidth(), has('syntax_items')). But more importantly, it would let us remember some previous state. For example, if we find a heredoc on a given line, we would do:

v:indent.heredoc = {
    lnum: v:lnum,
    startindent: indent(v:lnum),
    endmarker: getline(v:lnum)->matchstr('some regex'),
}

Then, on subsequent lines, we could inspect the value of v:indent.heredoc to know whether we're in a heredoc, and if so how to indent.

Right now, I do it with an ad-hoc buffer-local variable: b:vimindent_heredoc. But I need to delete it, which is not easy. If an error is given during an indentation (or C-c is pressed), then the deletion might fail.

The alternative is to inspect the syntax, which is not reliable, because it depends on the syntax being enabled and being able to parse all heredocs (which is tricky; I have yet to find the right incantation of syntax sync commands to achieve that).

lacygoill avatar Sep 07 '22 20:09 lacygoill

Codecov Report

Merging #11079 (4dd787f) into master (6f14da1) will increase coverage by 0.03%. The diff coverage is n/a.

:exclamation: Current head 4dd787f differs from pull request most recent head 42dec82. Consider uploading reports for the commit 42dec82 to get more accurate results

@@            Coverage Diff             @@
##           master   #11079      +/-   ##
==========================================
+ Coverage   81.74%   81.78%   +0.03%     
==========================================
  Files         162      162              
  Lines      188149   188441     +292     
  Branches    42814    42850      +36     
==========================================
+ Hits       153802   154115     +313     
+ Misses      21821    21784      -37     
- Partials    12526    12542      +16     
Flag Coverage Δ
huge-clang-none 82.68% <ø> (+0.01%) :arrow_up:
huge-gcc-none 53.73% <ø> (-0.80%) :arrow_down:
huge-gcc-testgui 52.16% <ø> (-0.79%) :arrow_down:
huge-gcc-unittests 0.29% <ø> (-0.01%) :arrow_down:
linux 82.44% <ø> (+<0.01%) :arrow_up:
mingw-x64-HUGE 76.46% <ø> (+76.46%) :arrow_up:
mingw-x64-HUGE-gui ?
mingw-x86-HUGE 77.33% <ø> (?)
windows 78.12% <ø> (+1.15%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/iscygpty.c 0.00% <0.00%> (-30.56%) :arrow_down:
src/vim9script.c 88.75% <0.00%> (-1.61%) :arrow_down:
src/vim9instr.c 81.55% <0.00%> (-0.76%) :arrow_down:
src/vim9execute.c 90.77% <0.00%> (-0.60%) :arrow_down:
src/vim9cmds.c 85.70% <0.00%> (-0.44%) :arrow_down:
src/netbeans.c 72.30% <0.00%> (-0.37%) :arrow_down:
src/regexp_nfa.c 89.50% <0.00%> (-0.23%) :arrow_down:
src/session.c 64.93% <0.00%> (-0.16%) :arrow_down:
src/main.c 83.97% <0.00%> (-0.15%) :arrow_down:
src/popupwin.c 86.68% <0.00%> (-0.14%) :arrow_down:
... and 63 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

codecov[bot] avatar Sep 11 '22 04:09 codecov[bot]

I think it's good enough now.

I fixed all the issues I could find, and added more tests.


Regarding the existing tests, I changed a few. This one:

let a =<< END
nothing
END

Was indented like this:

let a =<< END
    nothing
END

That's wrong. Without the trim argument, leading whitespace is semantic, not cosmetic. If we indent nothing, we add spaces inside the value assigned to the a variable. Indenting the code should not change the value of a variable.

The new test distinguishes 2 cases; depending on whether trim is used. The new plugin leaves this block unchanged:

let a =<< END
nothing
END

OTOH, it does indent the body of this heredoc:

let a =<< trim END
    nothing
END

This other test:

func Some()
     " next-line
let f = x
endfunc

Was indented like this:

func Some()
     " next-line
     let f = x
endfunc

IMO, that's wrong. The code should not care about how a comment is indented. A comment is meaningless for the machine. And here, the indent of the comment is wrong (it should be 4 not 5). There is no reason for the code to be aligned with a wrongly indented comment. This is inconsistent and not persistent, because =ip will fix both the comment and the code anyway.

Besides, I had difficulties handling comments. The indent expression became simpler and more reliable as soon as I ignored comments when computing the indent of a line of code. So, this is not something which I intend to fix/change. Even if I wanted to, I don't think I could. To compute the indent of a comment which is not preceded by another comment, the new indent expression re-calls itself to eagerly compute the indent of the next line of code. Because a comment should be indented like the next line of code (see here). So, if a line of code cares about its comment, we might get a circular definition: the indent of a line of code depends on its comment above, which depends on the code below, which depends on the comment above, which ...


The new plugin fixes various issues, including this one:

if v:true
  echo 0
  end

Notice how end is wrongly indented. It should be:

if v:true
  echo 0
end

The new plugin fixes that.


Heredocs were not properly handled. For example:

var message: list<string> =<< trim END
    js-beautify is not installed, but you refer to it in your 'formatprg' option!

    Installation:

        $ npm install --global js-beautify

    Documentation:
    https://github.com/beautify-web/js-beautify
END

The plugin indents this assignment like so:

var message: list<string> =<< trim END
js-beautify is not installed, but you refer to it in your 'formatprg' option!

Installation:

$ npm install --global js-beautify

Documentation:
https://github.com/beautify-web/js-beautify
END

That's wrong. Now, when message will be printed, the $ npm install ... line will no longer be indented. The new plugin correctly handles that; it does not add/remove new leading whitespace inside the assigned value.


I don't know how well the plugin works against legacy code. I just tested it on the matchit autoload script (which is quite long), and didn't spot any obvious mistake. It certainly gives better results than the current plugin. With the latter, the indentation gets wrong early on in the script.

lacygoill avatar Sep 11 '22 04:09 lacygoill

The plugin indents this assignment like so:

That must be caused by some local config. Without config, it's indented like this instead:

vim9script
var message: list<string> =<< trim END
        js-beautify is not installed, but you refer to it in your 'formatprg' option!

        Installation:

        $ npm install --global js-beautify

        Documentation:
        https://github.com/beautify-web/js-beautify
END

But it doesn't matter; it's still wrong.

lacygoill avatar Sep 11 '22 04:09 lacygoill

There are things which are too tricky to handle.

Here, the trailing slash looks like an arithmetic division, while it's just a delimiter:

substitute/pat / rep /
                     ^

Here, the trailing paren looks like a grouping operator, while it's just a key:

nunmap <buffer> (
                ^

Here, the trailing percent looks like a modulo operator, while it's just the current file:

edit %
     ^

This affects the indent of the next line. To reliably handle all of these, treesitter might help.

lacygoill avatar Sep 14 '22 11:09 lacygoill

Some more which are tricky:

mark <
mark [

< is not a comparison operator, and [ does not open a list.

lacygoill avatar Sep 14 '22 12:09 lacygoill

Is it now good enough to include? I'm sure there will be further updates later, we can't wait until it is perfect.

brammool avatar Sep 14 '22 14:09 brammool

Is it now good enough to include?

I think it's good enough.

lacygoill avatar Sep 14 '22 14:09 lacygoill

Is it now good enough to include?

I think it's good enough.

Do you mind listing yourself as the maintainer?

-- "Thou shalt not follow the Null Pointer, for at its end Chaos and Madness lie."

/// Bram Moolenaar -- @.*** -- http://www.Moolenaar.net \
/// \
\\ sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ /// \\ help me help AIDS victims -- http://ICCF-Holland.org ///

brammool avatar Sep 14 '22 16:09 brammool

Not sure I can maintain the script, but we'll see.

lacygoill avatar Sep 15 '22 02:09 lacygoill

Thank you for doing this! I'm curious, why aren't you putting the function into an autoload script and use import autoload '../autoload/dist/vim.vim' (or any other autoload directory)? Currently, the indent function is always sourced on startup which isn't necessary.

bfrg avatar Sep 15 '22 11:09 bfrg

It's only loaded if Vim is passed a Vim script file, and if the indent feature is enabled. Otherwise, it should not be, so I thought that the gain was negligible. That said, it doesn't hurt.

lacygoill avatar Sep 15 '22 11:09 lacygoill

I noticed something a long time ago, but never asked so far: we can't use fullcommand() on abbreviated control flow keywords in Vim9 context:

:vim9cmd echo fullcommand('en')
E1065: Command cannot be shortened: en

Is it working as intended?

I understand the error, but it's not obvious. Vim doesn't want to give :endif, because if it did, it would implicitly acknowledge the fact that :endif can be shortened into :en which is disallowed in Vim9 context. But that's an implicit reason. We're not explicitly asking to shorten :endif into :en. So, the message might be hard to understand for some users.

Anyway, I think it would be convenient if no error was given. Otherwise, we need a legacy function just to be able to call fullcommand(); that's what the new Vim indent plugin has to do. And since legacy code is slower, especially in an indent expression which is called many times, it might help to lift this restriction.

lacygoill avatar Sep 15 '22 11:09 lacygoill

One more tricky bug to fix. Expected:

vim9script
if end == 'xxx' || end == 'yyy'
    echo
endif

Actual:

vim9script
if end == 'xxx' || end == 'yyy'
echo
endif

I'm working on it.

lacygoill avatar Sep 15 '22 12:09 lacygoill

Not sure I can maintain the script, but we'll see.

You are doing very well so far :-).

In many cases it would mean reviewing suggested changes and including them. Users who run into problems will know Vim script!

-- ASCII stupid question, get a stupid ANSI.

/// Bram Moolenaar -- @.*** -- http://www.Moolenaar.net \
/// \
\\ sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ /// \\ help me help AIDS victims -- http://ICCF-Holland.org ///

brammool avatar Sep 15 '22 12:09 brammool

I noticed something a long time ago, but never asked so far: we can't use fullcommand() on abbreviated control flow keywords in Vim9 context:

:vim9cmd echo fullcommand('en')
E1065: Command cannot be shortened: en

Is it working as intended?

I understand the error, but it's not obvious. Vim doesn't want to give :endif, because if it did, it would implicitly acknowledge the fact that :endif can be shortened into :en which is disallowed in Vim9 context. But that's an implicit reason. We're not explicitly asking to shorten :endif into :en. So, the message might be hard to understand for some users.

Anyway, I think it would be convenient if no error was given. Otherwise, we need a legacy function just to be able to call fullcommand(); that's what the new Vim indent plugin has to do. And since legacy code is slower, especially in an indent expression which is called many times, it might help to lift this restriction.

Agreed, it should just return an empty string, not give an error message.

To make it work in Vim9 script context, you can use ":legacy". Is that really much slower?

-- hundred-and-one symptoms of being an internet addict: 83. Batteries in the TV remote now last for months.

/// Bram Moolenaar -- @.*** -- http://www.Moolenaar.net \
/// \
\\ sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ /// \\ help me help AIDS victims -- http://ICCF-Holland.org ///

brammool avatar Sep 15 '22 12:09 brammool

To make it work in Vim9 script context, you can use ":legacy". Is that really much slower?

In theory yes, but not in this case (nor where I found the issue originally). With :legacy, we can't access local variables:

vim9script
def Func(kwd: string)
    echo fullcommand(kwd)
enddef
Func('en')
E1065: Command cannot be shortened: en
vim9script
def Func(kwd: string)
    legacy echo fullcommand(kwd)
enddef
Func('en')
E121: Undefined variable: kwd
E116: Invalid arguments for function fullcommand(kwd)

So, we need a legacy function:

vim9script
def Func(kwd: string)
    echo FullCommand(kwd)
enddef
function FullCommand(kwd)
    return fullcommand(a:kwd)
endfunction
Func('en')
endif

lacygoill avatar Sep 15 '22 12:09 lacygoill

That said, I didn't make any measurement. It might not have a significant impact.

lacygoill avatar Sep 15 '22 12:09 lacygoill

One more tricky bug to fix.

Fixed in this commit

lacygoill avatar Sep 15 '22 12:09 lacygoill

To make it work in Vim9 script context, you can use ":legacy". Is that really much slower?

In theory yes, but not in this case (nor where I found the issue originally). With :legacy, we can't access local variables:

vim9script
def Func(kwd: string)
    echo fullcommand(kwd)
enddef
Func('en')
E1065: Command cannot be shortened: en
vim9script
def Func(kwd: string)
    legacy echo fullcommand(kwd)
enddef
Func('en')
E121: Undefined variable: kwd
E116: Invalid arguments for function fullcommand(kwd)

So, we need a legacy function:

vim9script
def Func(kwd: string)
    echo FullCommand(kwd)
enddef
function FullCommand(kwd)
    return fullcommand(a:kwd)
endfunction
Func('en')
endif

I see, prefixing :vim9cmd or :legacy makes it more complicated. I think it's not bad to add an optional argument:

fullcommand(cmd_name, is_vim9)

Also, should not give an error message.

-- If your nose runs, and your feet smell, you might be upside down.

/// Bram Moolenaar -- @.*** -- http://www.Moolenaar.net \
/// \
\\ sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ /// \\ help me help AIDS victims -- http://ICCF-Holland.org ///

brammool avatar Sep 15 '22 16:09 brammool

I need to fix more bugs. One of them would be easier to fix if Vim did not behave in an unexpected way:

vim9script
var lines =<< trim END
    if true
        echo
    endif
END
lines->setline(1)
cursor(3, 1)
echomsg searchpair('\<if\>', '', '\<endif\>', 'bnW')
echomsg searchpair('\<if\>', '', '\<endif\>', 'bcnW')
1
0
vim9script
var lines =<< trim END
    if true
        echo
    endif
END
lines->setline(1)
cursor(1, 1)
echomsg searchpair('\<if\>', '', '\<endif\>', 'nW')
echomsg searchpair('\<if\>', '', '\<endif\>', 'cnW')
3
0

In both cases, I would expect searchpair() to find a match with and without the c flag. It seems that c can prevent a match from being found. How is that possible? c does not put any extra requirement. On the contrary, it makes the search more permissive by accepting a match where the cursor is.

It looks similar to issue https://github.com/vim/vim/issues/6735 which was fixed in patch 8.2.5152

lacygoill avatar Sep 15 '22 18:09 lacygoill

I need to fix more bugs. One of them would be easier to fix if Vim did not behave in an unexpected way:

vim9script
var lines =<< trim END
    if true
        echo
    endif
END
lines->setline(1)
cursor(3, 1)
echomsg searchpair('\<if\>', '', '\<endif\>', 'bnW')
echomsg searchpair('\<if\>', '', '\<endif\>', 'bcnW')
1
0
vim9script
var lines =<< trim END
    if true
        echo
    endif
END
lines->setline(1)
cursor(1, 1)
echomsg searchpair('\<if\>', '', '\<endif\>', 'nW')
echomsg searchpair('\<if\>', '', '\<endif\>', 'cnW')
3
0

In both cases, I would expect searchpair() to find a match with and without the c flag. It seems that c can prevent a match from being found. How is that possible? c does not put any extra requirement. On the contrary, it makes the search more permissive by accepting a match where the cursor is.

What happens is that when starting at "endif" and accepting that match, then it's like being after the "endif". It will search back for the second "if": 1 if anything 2 if true 3 echo 4 endif

Start at line 4, then it will result in "1".

-- hundred-and-one symptoms of being an internet addict: 89. In addition to your e-mail address being on your business cards you even have your own domain.

/// Bram Moolenaar -- @.*** -- http://www.Moolenaar.net \
/// \
\\ sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ /// \\ help me help AIDS victims -- http://ICCF-Holland.org ///

brammool avatar Sep 15 '22 20:09 brammool

Thanks a lot for taking care of Vim 9 script indentation! I have many instances of higher order functions like this:

def Select(Cont: func(func(any)), Pred: func(any): bool): func(func(any))
  return (Emit: func(any)) => {
    Cont((t: any) => {
      if Pred(t)
        Emit(t)
      endif
    })
    }  # <== This should be indented as `return`
enddef

Is something like this already taken care of?

lifepillar avatar Sep 17 '22 20:09 lifepillar

Is something like this already taken care of?

On my local copy, yes. By the current PR, I don't know.

I need to fix more bugs related to nested lists, dicts, blocks. I think it will take a few more days.


BTW, I have no idea why GitHub says that some tests are failing. I didn't touch the C source code at all. I looked at the logs, and tried to reproduce but couldn't. I think the only way I could reproduce is by disabling the VIMRUNTIME entirely (with something like HOME=/tmp...).

lacygoill avatar Sep 17 '22 20:09 lacygoill

Thanks a lot for taking care of Vim 9 script indentation! I have many instances of higher order functions like this:

def Select(Cont: func(func(any)), Pred: func(any): bool): func(func(any))
  return (Emit: func(any)) => {
    Cont((t: any) => {
      if Pred(t)
        Emit(t)
      endif
    })
    }  # <== This should be indented as `return`
enddef

Is something like this already taken care of?

My preference would be to have "}" aligned with the start of the line above only when it closes a block. In this case it ends an argument, I would prefer to give it some indent.

..def Select(Cont: func(func(any)), Pred: func(any): bool): func(func(any)) ....return (Emit: func(any)) => { ........Cont((t: any) => { ............if Pred(t) ..............Emit(t) ............endif ..........}) ......} ..enddef

Thus the block of the lambda has twice the indent, the closing "}" one. Can probably do the same when the "}" ends a dictionary:

....var d = { ........key: 'value', ......}

(replacing the spaces with dots to make sure the don't get messed up in transit)

-- From "know your smileys": :-& Eating spaghetti

/// Bram Moolenaar -- @.*** -- http://www.Moolenaar.net \
/// \
\\ sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ /// \\ help me help AIDS victims -- http://ICCF-Holland.org ///

brammool avatar Sep 17 '22 21:09 brammool

I ran gg=G vim started with command /usr/local/bin/vim --clean light.vim

This is with latest vim (patch 9.0.0487 commit acd6b9976b)

vim9script
g:lightline = {
        'active': {
                'left': [ [ 'mode', 'paste' ], [ 'readonly', 'relativepath', 'modified' ] ],
                },
                'inactive': {
                        'left': [ [ 'readonly', 'relativepath', 'modified' ] ],
                        }
                        }

This is with this pull request (commit 887d1164671)

vim9script
g:lightline = {
'active': {
'left': [ [ 'mode', 'paste' ], [ 'readonly', 'relativepath', 'modified' ] ],
},
'inactive': {
'left': [ [ 'readonly', 'relativepath', 'modified' ] ],
}
}

Ideally I expect

vim9script
g:lightline = {
        'active': {
                'left': [ [ 'mode', 'paste' ], [ 'readonly', 'relativepath', 'modified' ] ],
        },
        'inactive': {
                'left': [ [ 'readonly', 'relativepath', 'modified' ] ],
        }
}

balki avatar Sep 17 '22 21:09 balki

Ideally I expect

This is exactly what I have with my local copy. But I don't push it yet, because it has broken 5 tests. I need to fix them before pushing.

lacygoill avatar Sep 17 '22 21:09 lacygoill

Is something like this already taken care of?

The PR correctly handles this snippet now. It's also included among the tests.


Ideally I expect

Same thing.


My preference would be to have "}" aligned with the start of the line above only when it closes a block. In this case it ends an argument, I would prefer to give it some indent.

I added a config variable g:vim_indent to support this:

let g:vim_indent.more_in_bracket_block = v:true

It seems this indent style is better suited for shiftwidth=2. Starting from shiftwidth=4, I find that the code moves too far away from the start of the line. So the config variable defaults to v:false. If people prefer this style to be the default, I can change it.

Can probably do the same when the "}" ends a dictionary:

If we do that for }, we should probably do the same for ) and ]. The previous config variable does it for all types of blocks.

lacygoill avatar Sep 19 '22 17:09 lacygoill

Do we need to support this?

var ll = [[
    1,
    2,
    3], [
    4,
    5,
    6], [
    7,
    8,
    9]]

var ld = [{
    a: 1,
    b: 2}, {
    c: 3,
    d: 4}, {
    e: 5,
    f: 6}, {
    }]

I found this type of indentation in the killersheep plugin.

First, it's difficult to support.

Second, it seems to be an unconventional way of formatting the code. If we give it to the first typescript formatter suggested by Google, here is what we get in return:

var ll = [
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9],
];

var ld = [
  {
    a: 1,
    b: 2,
  },
  {
    c: 3,
    d: 4,
  },
  {
    e: 5,
    f: 6,
  },
  {},
];

One could argue that in some cases, the original formatting is better (like in the killersheep plugin, so that the sprites are all in the same column). But wouldn't heredocs be better? If not, would inline heredocs help (as requested in issue #8623)?

lacygoill avatar Sep 19 '22 17:09 lacygoill

While working on this, I've noticed that the first line in a heredoc assignment can be followed by an inline comment:

vim9script
var heredoc =<< trim END  # inline comment
    a
    b
    c
END
echo heredoc
['a', 'b', 'c']

But not by a bar:

vim9script
var heredoc =<< trim END  | echo 'd'
    a
    b
    c
END
echo heredoc
E488: Trailing characters:   | echo 'd'

OTOH, the last line cannot be followed by anything:

vim9script
var heredoc =<< trim END
    a
    b
    c
END  # inline comment
E990: Missing end marker 'END'
vim9script
var heredoc =<< trim END
    a
    b
    c
END  | echo 'd'
E990: Missing end marker 'END'

This seems inconsistent, and lets us write ambiguous code (e.g. the reader might wonder if an inline comment is actually part of the heredoc body). FWIW, I would prefer nothing to be allowed after the first and last line of a heredoc.


Same thing for a curly block:

vim9script
{  # inline comment
    echo 'a'
    echo 'b'
    echo 'c'
}
a
b
c
vim9script
{  | echo 'd'
    echo 'a'
    echo 'b'
    echo 'c'
}
d
a
b
c

Except that this time, we do allow a bar after the first line, which IMO adds to the confusion. Even more confusing, if the curly block is used in a lambda, we can add a bar, but any command which follows is ignored:

vim9script
timer_start(0, (_) => {  | echowindow 'd'
    echowindow 'a'
    echowindow 'b'
    echowindow 'c'
})

Disallowing everything after { would make the code easier to read, and more predictable (not to mention that the rule is simple to understand and remember, because no weird exception).

lacygoill avatar Sep 19 '22 17:09 lacygoill

Even more confusing, if the curly block is used in a lambda, we can add a bar, but any command which follows is ignored:

If it was not ignored, d would be echo'ed, but it's not.

lacygoill avatar Sep 19 '22 17:09 lacygoill