LuaSnip icon indicating copy to clipboard operation
LuaSnip copied to clipboard

Use neovim-parser for LSP-snippets.

Open L3MON4D3 opened this issue 2 years ago • 18 comments

As the neovim-parser is

  1. included in neovim (not really a plus, but not a negative, because we're not adding a dependency for the parser)
  2. already supports format/transform (regex-stuff) and
  3. more maintainable than the current one (which is very hack-ish :grimacing:)

we should switch to it.

Being able to create snippets from a more abstract AST is also useful for manipulating snippets returned by LSP, which is already done on the string-representation here, but it's simpler on an AST.

One downside of the neovim-parser is that it is a bit slower, but that should make no difference since lsp-snippets can all be parsed (and are, loaded ones at least) on expand, and the time for parsing one snippet should be negligible.

L3MON4D3 avatar May 25 '22 21:05 L3MON4D3

Maybe this is a good chance to also include an alternative extended-parser that handles back-ticks so we can finally add the functionality to the snipemate loader.

leiserfg avatar May 25 '22 22:05 leiserfg

Right, yes: I think we can just

  1. replace each occurrence of `...` with a custom Variable, like $__LUASNIP_SNIPMATE1
  2. Pass a table that maps the variable to vimscript-expressions
  3. When creating the luasnip-node for that ast-node, look into that table and create a function that evaluates and returns the vimscript-expression.

The stuff in backticks is only run on expansion, eg does not depend on other nodes, right?

L3MON4D3 avatar May 26 '22 05:05 L3MON4D3

@kmarius I've been looking into ways to glue together jsregexp and the AST returned by require("vim.lsp._snippet").parse. The parser already takes care of parsing the format-string into smaller bits, so we have

" ${1:/upcase}${2:/upcase} ${1:?if:else}" ->
{
	{
		esc = " ",
		raw = " ",
		type = 7,
	}, {
		capture_index = 1,
		modifier = "upcase",
		type = 6,
	}, {
		capture_index = 2,
		modifier = "upcase",
		type = 6,
	}, {
		esc = " ",
		raw = " ",
		type = 7,
	}, {
		capture_index = 1,
		else_text = "else",
		if_text = "if",
		type = 6,
	}
}

Could you adapt the current API so jsregexp accepts structured data instead of the plain string? (No worries if you're busy atm, I can also get to that later :) ) That seems nicer than undoing nvims parse and then doing another one in jsregexp :D

L3MON4D3 avatar May 26 '22 20:05 L3MON4D3

Changing the API is not a problem, is the parser output documented somewhere? The regex engine works with UTF16 and I haven't figured out how convert back and forth. Only ascii works currently (I think).

kmarius avatar May 27 '22 08:05 kmarius

Changing the API is not a problem, is the parser output documented somewhere?

No, but the code is pretty readable. You can try different inputs via

print(vim.inspect(require("vim.lsp._snippet").parse("${1/(.*)/.../}")))

Concerning unicode, I think ASCII only is alright for now :)

L3MON4D3 avatar May 27 '22 10:05 L3MON4D3

If it's more convenient, we can also transform the format-layout in lua, however you like (are lists more convenient than maps?)

L3MON4D3 avatar May 27 '22 10:05 L3MON4D3

I think it's better to keep the jsregex just taking care of exposing the captures of the regex and do the transformations in lua side, as it makes it easier to iterate (if something is broken we don't have to rebuild the c library)

leiserfg avatar May 27 '22 11:05 leiserfg

Hmm, yeah that's true. If that is in lua we could also allow user-defined modifiers (${1/inc_number}, for example), which would be pretty cool (the parser would have to be modified though)

L3MON4D3 avatar May 27 '22 11:05 L3MON4D3

Oooookay, this is finally almost done, I'll just need to do some good tests, then this should be good to go. Also: check performance, we're doing a lot more now. No difference for me, but should be checked properly. Also!: check what's going wrong with CI (probably something missing to compile jsregexp)

L3MON4D3 avatar Jul 18 '22 21:07 L3MON4D3

Ooookay, now, 25 days after "finally almost done", I think this is done for real. Some stats: Loading all of vim-snippets before: pre Loadin all of vim-snippets after: post All in all very nice, just an increase of ~2.4 to ~3.4 seconds for parsing, considering the slower parser and good amount of postprocessing (all worth it, snippets are much more accurate now) we're doing, this is pretty good. Both with and without this patch, expanding a single snippet takes <10ms. This is much more important since we don't parse snippets in bulk at any point (well, except if someone has a perverse number of parsed snippets), but rather parse them just before they're expanded (via snippetProxy). I might make a public request that users go and use this branch for a bit, maybe..

L3MON4D3 avatar Aug 12 '22 20:08 L3MON4D3

Using the branch for a few days, not having breakages yet, although my usage mainly boils down to using snippets provided by LSP

kuator avatar Aug 21 '22 11:08 kuator

I only just saw that you added jsregexp as a submodule. Just make sure you fix a tag or commit because there will be some breakage in master (notably, https://github.com/kmarius/jsregexp/pull/11). I am not planning a release on luarocks anytime soon though. The next version will probably include reg:exec(str) and reg:test(str) methods which should come handy for #547.

kmarius avatar Aug 24 '22 10:08 kmarius

Ah, thanks for looking out :+1: I think the submodule should be fixed as long as it's not explicitly set up to track a branch (which it isn't, so should work :D).

I am not planning a release on luarocks anytime soon though.

Ah, right, might've gotten nasty if it wasn't fixed (+regular breaking changes ofc)

__call -> exec? What can test do?

L3MON4D3 avatar Aug 24 '22 12:08 L3MON4D3

We are looking to mimic the javascript RegExp objects. exec returns the first (next if global) match in the string, test simply checks for a match without building the result. These are the only methods of RegExp, everything else, findAll, match, replace, etc. can be built from these (theoretically, at least) and are methods of the string type. __call will stay for compatibility (for a while, at least).

kmarius avatar Aug 24 '22 12:08 kmarius

Ahh, that makes sense. Yeah test could make sense for the triggers, but if it matches, we'll still need the captures (although, it will not match far more often than match, so if test has a performance-benefit over exec/__call that sounds pretty nice)

L3MON4D3 avatar Aug 24 '22 13:08 L3MON4D3

Just make sure you fix a tag or commit because there will be some breakage in master

git submodules always point to a commit, in this case: https://github.com/kmarius/jsregexp/tree/c3e473240eebb65a8870abebafeff83b6c9e7f16

leiserfg avatar Aug 24 '22 13:08 leiserfg

Yeah test could make sense for the triggers,

IMHO, it will be less performant because the regex has to calculate the groups when it's doing the match. And using a test first will require evaluating twice when there is a match.

leiserfg avatar Aug 24 '22 13:08 leiserfg

git submodules always point to a commit, in this case: https://github.com/kmarius/jsregexp/tree/c3e473240eebb65a8870abebafeff83b6c9e7f16

Ah, good to know.

IMHO, it will be less performant because the regex has to calculate the groups when it's doing the match. And using a test first will require evaluating twice when there is a match.

I think it might still be worth it since you could potentially be doing many tests that fail. Doing a single additional exec when test succeeds can't be that bad, no? (I have no idea about how much time is spent matching vs. creating the result).

Edit: Clearly I was not thinking right. If exec fails we are not creating a match anyway. test really only makes sense if you are not interested in what matched or what the groups are in any way.

kmarius avatar Aug 24 '22 14:08 kmarius

Alright, exec it is :D

L3MON4D3 avatar Aug 25 '22 23:08 L3MON4D3

Let's hope this goes smoothly :crossed_fingers:

L3MON4D3 avatar Aug 27 '22 18:08 L3MON4D3