nvim-surround icon indicating copy to clipboard operation
nvim-surround copied to clipboard

Custom deletions based on patterns/Tree-sitter

Open Neelfrost opened this issue 2 years ago • 30 comments

Checklist

  • [x] Have you read through :h nvim-surround to see if there might be any relevant information there?

Describe the solution you'd like

  • print('hello') > dsf > 'hello'
  • print('hello') > csf > promt > pprint('hello')

Neelfrost avatar Jul 13 '22 17:07 Neelfrost

I would like to keep this plugin as minimal as possible, so maybe it would be more sensible for advanced users to define their own operator maps? Maybe something similar to #52 would work...

kylechui avatar Jul 13 '22 17:07 kylechui

Came here to suggest this but you got here first!

theol0403 avatar Jul 13 '22 17:07 theol0403

@kylechui just an idea - currently you allow custom functions to add pairs. Could you allow this for deletion also? This way, precisely how the deletion is done becomes a problem for the implementor of that function (sure, you could maybe provide some defaults if you felt generous/interested, but optimizing it can become a collaborative problem).

Maybe if you modified the existing function signature so it's something like:

           ["i"] = function(opts)
                if opts.adding then {
                  return {
                      require("nvim-surround.utils").get_input(
                          "Enter the left delimiter: "
                      ),
                      require("nvim-surround.utils").get_input(
                          "Enter the right delimiter: "
                      )
                  }
                }
                elseif opts.deleting then {
                  -- implement some custom logic to find function surroundings and delete them. 
                  -- Maybe you have this function returns the row, col boundaries so that *you* (nvim-surround) do the actual deleting - that way you can also get `opts.changing` for free/more easily
                }
            end,

andrewferrier avatar Jul 13 '22 17:07 andrewferrier

(With the above approach, existing custom callbacks would still work for adding. They might fail if you used them for deleting. Alternative more backwards-compat suggestion:

["i"] = {
            {
              mode = 'adding',
              callback = function(opts)
                return {
                      require("nvim-surround.utils").get_input(
                          "Enter the left delimiter: "
                      ),
                      require("nvim-surround.utils").get_input(
                          "Enter the right delimiter: "
                      )
                  }
                }
             },
             {
             mode = 'deleting',
              callback = function(opts)
                  -- implement some custom logic to find function surroundings and delete them. 
                  -- Maybe you have this function returns the row, col boundaries so that *you* (nvim-surround) do the actual deleting
                }
             }
         }

(pseudo-code is ugly but you get the idea!)

This way you can check if ["i"] maps to an object or function and behave accordingly.

andrewferrier avatar Jul 13 '22 17:07 andrewferrier

This seems like a doable/reasonable idea---I was originally going to complain about having to use vim.fn.searchpos() and vim.fn.searchpairpos(), but if the user provides the selections then we could be on to something here...

As for pseudocode it's probably just easier to just use keys into the table, e.g.

["i"] = {
    adding = --Current implementation
    deleting = function()
    end,
}

kylechui avatar Jul 13 '22 17:07 kylechui

I definitely think trying to delete a "function" as a generalized function will be quite hard to get right, since in what language and what does "a function" look like generally is a pretty hard problem. I don't think taking on the burden of writing endless treesitter queries per language is sustainable either, as is evidenced by all the plugins that have done the same only to hit the roadblock that is defining queries for languages you'd never even heard of.

That being said, this seems like potentially pretty cool functionality that maybe treesitter could help with. I'm not sure if directly depending on treesitter text objects will work as you've seen yourself with recent breakage there, this makes the implementation here brittle. I wonder if a middle ground in what @andrewferrier is suggesting could be allowing users to specify queries to determine what a function is for whatever language and what is the body of the function, so then this plugin can derive the start i.e. beginning of the node range → start of body up till the end of body → end of node range.

I think a generic function where a user will have to provide the ranges themselves will be quite convoluted to implement for the user. I can't imagine what a simple version of such a function would be, that a user would quickly think to add in their config without spending hours on it.

akinsho avatar Jul 13 '22 17:07 akinsho

@kylechui, I was just trying to be considerate of anyone who had already defined a custom callback function directly under:

["x"] = function()
    -- something only to do with adding
end

But if you're going to do this, now might be the time to break that, since I doubt (no offence!) many folks have invested a lot of effort in that.

andrewferrier avatar Jul 13 '22 17:07 andrewferrier

@akinsho I 100% agree, which is why my original plan was to ignore non-textobject cases and focus only on extending functionality via defining new textobjects. I opened #52 a few days ago to get a sense for whether or not this might be possible, as I'm pretty sure the difference between g@a[char] and g@i[char] would yield the surrounding pair for deletion/modification. I've already tested this a little bit on branch modify-custom-textobjects, and it seems to work in theory for HTML tags, instead of my old method of "manually searching" for the corresponding <, > characters after getting the "big" selection via g@at.

Edit: Indeed this issue would be compounded by the fact that different languages presumably have different ways for calling functions, and a callback would need to be setup per buffer/filetype, which would be quite a pain.

kylechui avatar Jul 13 '22 17:07 kylechui

I definitely think trying to delete a "function" as a generalized function will be quite hard to get right, since in what language and what does "a function" look like generally is a pretty hard problem. I don't think taking on the burden of writing endless treesitter queries per language is sustainable either, as is evidenced by all the plugins that have done the same only to hit the roadblock that is defining queries for languages you'd never even heard of.

That being said, this seems like potentially pretty cool functionality that maybe treesitter could help with. I'm not sure if directly depending on treesitter text objects will work as you've seen yourself with recent breakage there, this makes the implementation here brittle. I wonder if a middle ground in what @andrewferrier is suggesting could be allowing users to specify queries to determine what a function is for whatever language and what is the body of the function, so then this plugin can derive the start i.e. beginning of the node range → start of body up till the end of body → end of node range.

I think a generic function where a user will have to provide the ranges themselves will be quite convoluted to implement for the user. I can't imagine what a simple version of such a function would be, that a user would quickly think to add in their config without spending hours on it.

Just to clarify; I'm not really focusing on the function surround (f) specifically here. That's just an example (although I acknowledge it was the original point of this issue, which I've hijacked slightly, sorry!). The point is that with custom callbacks that define the boundaries of the surrounding, anything can potentially be deleted/changed. I agree this would be for more advanced users - but I guess I'm hoping there might evolve a wiki/collection/other plugins that build on top of this one that do it.

I do agree with you that defining functions in the general case would be a hard problem, and doing that well would probably need to use treesitter.

andrewferrier avatar Jul 13 '22 17:07 andrewferrier

@kylechui As an aside, I suggest it might be less confusing if you stop referring to pairs as 'text objects' ;) Unless I'm misunderstanding you, they are not text objects - at least not in the Vim sense (in Vim, a text object is anything you can use after an operator). I like the word 'surround', personally, but up to you...

(at least, I think that's what you're saying - maybe I'm getting confused now...)

andrewferrier avatar Jul 13 '22 18:07 andrewferrier

@andrewferrier I'm not referring to the surrounds themselves when I say "textobject"---I'm referring to how textobjects are used (by both my plugin and vim-surround) to actually discern the location of the surrounds. For example, to find the nearest set of parentheses, it's analogous to first running va( and then taking only the pairs from the ends, where a( is the textobject.

Edit: This is the reason why more complex deletions/changes would require quite a bit more effort to refactor, even in the case of #61; va* isn't a builtin textobject so ds*/cs*[char] cannot work under the current implementation

kylechui avatar Jul 13 '22 18:07 kylechui

OK got it, my bad. Thanks for clarifying :)

andrewferrier avatar Jul 13 '22 18:07 andrewferrier

No worries! I forget that not everybody has been staring through the code/implementation of this project for as long as I have, so I'm glad to clarify any questions about why certain things can/cannot be done easily

kylechui avatar Jul 13 '22 18:07 kylechui

the difference between g@a[char] and g@i[char]

Hmm, this is interesting because I guess if you get this working then you don't even need to care how the text object was created or if it uses treesitter. I mean, whether treesitter text objects works will be a moot point because how a text object is defined is no longer your problem since all this plugin will do is remove outside vs inside. If you're computing this dynamically i.e. per every call would it wouldn't need filetype specific logic, would it?

To me, it seems like you'd found quite a nice solution, not really following on why it's proving problematic? Is this specific to nvim treesitter text objects being broken?

akinsho avatar Jul 13 '22 18:07 akinsho

@akinsho Exactly, the point was that this would provide an easy way to implement modification of "dynamic-length" surrounds without needing to overhaul the code. The main issue stopping me from continuing is that nvim-treesitter-textobjects is broken (or so I think). But I think this method should work for non-treesitter defined textobjects, but I only really have HTML-style tags to go off of.

Edit: The downside to this is that "less popular" surrounds like { "func_name(", ")" } aren't defined as textobjects by anything (to my knowledge), so I'm not sure this could be handled by this method without the user having to define these textobjects via searchpairpos() themselves.

Edit 2: This is @call.outer and @call.inner

kylechui avatar Jul 13 '22 18:07 kylechui

@kylechui there are other plugins that do stuff like treesitter text objects you can maybe test with such as https://github.com/RRethy/nvim-treesitter-textsubjects or https://github.com/David-Kunz/treesitter-unit. Also, you could go back in their history to see if there's a commit that does what you want 🤷🏿‍♂️.

Re. the "func_name(" I'd describe this as a "function_call" not a function and that would be a different piece of functionality to provide. It's worth noting I looked at this for some other reason a while ago and not all TS parsers give a coherent result for what a "function_call" is. You could always do what https://github.com/Matt-A-Bennett/vim-surround-funk does but I've frankly never looked at the code, so it could be horrible to get working. Andrew Radev has another solution which works better but has fewer features called https://github.com/AndrewRadev/dsf.vim

akinsho avatar Jul 13 '22 18:07 akinsho

@akinsho Thanks for the info about those two plugins, and for the "function call" name. I think I have a decent idea for how I could implement it via searchpairpos(), but at the cost of "polluting" the user's operator maps. To me, at least, I'm at somewhat of a crossroads between a few different ways of proceeding:

  • Swap to my g@a[char], g@i[char] method
    • Much easier to implement
    • Might not be as extensible down the line (e.g. function calls might have to be treated as a special case, along with other surrounds that people might want)
    • Can conflict with existing omaps, e.g. what if someone is using nvim-treesitter-textobjects (which provides if, af) but then wants to use dsf for function calls, not functions?
  • Add another, new "parser" for finding surroundings given some start/end pattern (done with searchpairpos())
    • For functions and function calls, this can vary from language to language, requiring a buffer-local setup per filetype
  • Give the user a callback function and let them "figure it out", before returning the selections to delete

The final option seems like the worst of all three, as a feature that's hard to use is effectively no feature at all. I would have more confidence in the first if I could ensure that its issues had clearer workarounds. The second seems like the most "complete", but also results in more configuration that has to be done by the end user.

On second thought, function calls probably will have to be handled as a special case akin to HTML tags, as when you are "changing" them you probably mean just the name, or args, etc.

Edit: The more I think about it, the more I'm realizing that I'll probably end up using searchpairpos() for creating the omaps to begin with for method 1, so I might as well just use a blend of 1 and 2 to find the selections, e.g. using textobjects, but if none is found then use method 2, e.g. use Lua pattern matching with searchpairpos() to find the left/right delimiters.

kylechui avatar Jul 13 '22 18:07 kylechui

A blend of 1,2 sounds good/ideal, but in any case I think as long as your API is stable-ish you don't really have to worry about having to change tac internally at some point in the future I'd say. It's also fine to limit your scope and decide not to support certain things if it makes the plugin more stable in the long run IMO, but seems like you've got a way forward anyway

akinsho avatar Jul 17 '22 08:07 akinsho

While I generally agree with that sentiment, I think this feature (deleting by pattern) is probably something that not only most people want, but also something that would ease the "surround model" of nvim-surround that people have built in their heads. In other words, it's a lot easier to use this plugin if everything "just works" the way you expect it to, rather than having to remember all the various quirks and weird design choices offered by adding surrounds vs. changing/deleting, etc. One of my major goals with this project is to have this feel as intuitive as possible for those that have never used a surround plugin before, while maintaining extensibility for those that want it.

kylechui avatar Jul 17 '22 18:07 kylechui

Yeah definitely in support of having this functionality, I've been through a few delete surrounding function call plugins over the years It would be nice to have a robust solution.

akinsho avatar Jul 17 '22 21:07 akinsho

If you're interested in trying it out, I think mini.surround does have this functionality built-in already.

kylechui avatar Jul 17 '22 21:07 kylechui

My current solution works 85% of the time, so is alright for now

akinsho avatar Jul 17 '22 22:07 akinsho

Back on the modify-custom-textobjects grind. Hopefully I can figure out all the indexing problems I'm having right now

kylechui avatar Jul 20 '22 03:07 kylechui

Latest commit seems to pass all test cases locally; dsf also seems to work for Lua files but csf deletes the entire function which is strange. image

Doubly strange is that it seems to highlight the function "surround" properly.

Edit: Found the reason why; it's a combination of calling utils.get_nearest_selections() twice (for dot-repeatability) and the fact that vif does not forward-search past the current line that are causing the issue.

Edit 2: Came up with a "hack" for it, but it probably isn't robust.

kylechui avatar Jul 20 '22 03:07 kylechui

TIL @call.outer and @call.inner are things; it almost works for them! @akinsho It looks like the way that treesitter-textobjects define function calls includes the parentheses for @call.inner, so it raises a bit of a weird edge case. @Neelfrost I might consider an explicit "hard-coded" case for this, e.g. similar to HTML tags since I don't want the user to have to type csff instead of csf, analogous to how cst triggers the change tag mapping. I think treesitter-textobjects is still slightly broken (nobody has responded to my issue), but this commit is hopefully usable. TBH all of this complexity makes it seem like I might as well just code the pattern-matching, since it feels like there's not that much benefit to using textobjects at the moment :/

kylechui avatar Jul 20 '22 04:07 kylechui

@kylechui that's a shame what exactly is the result when you say it includes the parens do you mean it would delete the function name but leave the parentheses around the args? I wonder if that can then be worked around by doing a remove brackets call internally?

akinsho avatar Jul 21 '22 09:07 akinsho

@akinsho That's exactly it; another remove brackets call could be done but then at this point I feel like theres not amy benefit over using patterns to delete the function call

kylechui avatar Jul 21 '22 18:07 kylechui

If it's easier then that makes sense, I guess the patterns for a function don't vary that wildly except for things like lisps, but then you'd just use the brackets removal in those case for those uses 🤷🏿

akinsho avatar Jul 23 '22 12:07 akinsho

@akinsho I've been trying new things out on my pattern-matching branch (see #101) and this configuration seems to work for lisp-style functions (regular functions already work, to my knowledge):

["f"] = {
    add = function()
        return { { "(" .. vim.fn.input("Enter a function name: ") .. " " }, { ")" } }
    end,
    find = "%b()",
    delete = "^(.-%s+)().-(%))()$",
    change = {
        target = "^%(([^%s]+)().-()()$",
        replacement = function()
            return { { vim.fn.input("Enter the function name: ") }, { "" } }
        end,
    },
},

I think removing just the parentheses by themselves isn't ideal, as the function name would be left over. This pattern-based recognition deletes the function names when dsf is run, and changes only the function name when csf is run.

I'll be working on a more general way to modify surrounds with functions (so the user can use Tree-sitter, etc.) instead of just using patterns.

kylechui avatar Jul 25 '22 23:07 kylechui

Just tried it out and seems to work quite nicely ❤️, the cursor position doesn't seem to respect the move_cursor = false option though but that's an extremely minor issue since tbh not sure what I'd have expected since function is now gone, so the structure of the line might change too much

akinsho avatar Jul 26 '22 13:07 akinsho

This has finally been finished; v1.0.0 has been released and any further discussion about it can be done in #120. Thanks to everybody for their support and patience while I was ironing out all the bugs :)

kylechui avatar Aug 06 '22 03:08 kylechui

Great work! :) Just what I needed!

Neelfrost avatar Aug 14 '22 03:08 Neelfrost