symex.el icon indicating copy to clipboard operation
symex.el copied to clipboard

Tree sitter: transformation support

Open polaris64 opened this issue 2 years ago • 11 comments

Summary of Changes

I have started work on implementing the transformations in symex-transformations.el with tree sitter.

As with symex-primitives I have started moving the existing transformations to a new symex-transformations-lisp.el module and I have created a new symex-transformations-ts.el module for the new implementations. Once a transformation is implemented, I modify the appropriate function in symex-transformations.el to dispatch to the correct implementation based on the value of tree-sitter-mode.

This is a work-in-progress! Only a few transformations have been implemented so far (see the list below). The purpose of this PR is therefore to track this work until the point at which it is deemed complete enough to merge to symex-ts-integration.

Here are the transformations that I believe will need to be implemented; ticked items have been implemented in this PR: -

  • [X] append after node
  • [ ] capture node
  • [X] change node forward
  • [ ] clear node
  • [ ] comment node
  • [X] delete node forward/backward
  • [ ] delete remaining nodes
  • [ ] emit node
  • [X] insert at beginning/end of node
  • [X] insert before node
  • [ ] join node
  • [X] open line before/after node
  • [X] paste node before/after
  • [ ] replace node
  • [ ] shift forward/backward node
  • [ ] splice node
  • [ ] split node
  • [ ] swallow node
  • [ ] wrap node
  • [X] yank node
  • [ ] yank remaining nodes

Public Domain Dedication

  • [X] In contributing, I relinquish any copyright claims on my contribution and freely release it into the public domain in the simple hope that it will provide value.

(Why: The freely released, copyright-free work in this repository represents an investment in a better way of doing things called attribution-based economics. Attribution-based economics is based on the simple idea that we gain more by giving more, not by holding on to things that, truly, we could only create because we, in our turn, received from others. As it turns out, an economic system based on attribution -- where those who give more are more empowered -- is significantly more efficient than capitalism while also being stable and fair (unlike capitalism, on both counts), giving it transformative power to elevate the human condition and address the problems that face us today along with a host of others that have been intractable since the beginning. You can help make this a reality by releasing your work in the same way -- freely into the public domain in the simple hope of providing value. Learn more about attribution-based economics at drym.org, tell your friends, do your part.)

polaris64 avatar Jun 20 '22 12:06 polaris64

all of these seem to work from initial inspection, but will keep playing with them. :D

tommy-mor avatar Jun 20 '22 21:06 tommy-mor

it might be time to add a symex-global-mode, or at least one that applies to any tree-sitter buffer

tommy-mor avatar Jun 20 '22 21:06 tommy-mor

@tommy-mor What would you like to see in such a mode? If it is just to prevent Lisp-style addition of spaces, for now we could do something like:

(when (member major-mode symex-lisp-modes
  (symex--ensure-minor-mode))

... in symex-initialize, to prevent the minor mode from triggering. You're probably right that such a mode would be needed - would be good to have some concrete reasons before introducing one.

countvajhula avatar Jun 21 '22 07:06 countvajhula

Looking good so far! Some observations:

  • deleting a symex selects the parent expression instead of a neighboring expression
  • Paste isn't context sensitive. E.g. for multi-line expressions or expressions spanning a full line, it should add a newline, and for single-line expressions, it should add a space. Ideally, for an expression like (|arg1, arg2), it would be great if yp results in (arg1, |arg1, arg2). But I think this is out of scope for now, and we should evaluate tree-edit for these things before attempting them. So for now, this should add the space, e.g. (arg1 arg1, arg2) so that the user can manually add a comma if they like.
  • Undo (u) leaves things in a weird state, and you need to go down/up to resolve it

countvajhula avatar Jun 21 '22 19:06 countvajhula

@countvajhula Thanks for the feedback!

  • deleting a symex selects the parent expression instead of a neighboring expression
  • Undo (u) leaves things in a weird state, and you need to go down/up to resolve it

Yes there are a couple of issues like this. I need to think of a way to select a new node reliably after every transformation.

  • Paste isn't context sensitive. E.g. for multi-line expressions or expressions spanning a full line, it should add a newline, and for single-line expressions, it should add a space [...]

As you mentioned I think tree-edit is probably going to be the solution to many of these problems. However I agree that there are probably some simple things we can do here to improve it somewhat. I'll work on adding the criteria you mentioned.

polaris64 avatar Jun 22 '22 14:06 polaris64

FYI @polaris64 and @tommy-mor the evil-repeat (dot) work has landed 😄 .

Would probably be good to rebase the integration branch soon to level up - let me know whenever you'd like to do that @polaris64 . The tree-sitter transformations should automatically gain the dot functionality once that happens, since the repeat recording happens at the level of the public transformation interfaces and not the primitive layer (cf. symex--evil-repeatable-commands in #46 ).

countvajhula avatar Jul 05 '22 17:07 countvajhula

@countvajhula woohoo! That's very exciting!

I think it's a good idea to rebase the integration work as it will have to function with evil-repeat anyway. I haven't checked the diff yet so I'm not sure how much it conflicts with the TS changes made thus far, but if these are fairly minimal then I'd say go ahead.

If they are more in depth then we'll probably have to rebase the current integration work and then I can start again on the transformation work. I haven't done a great deal so far, so applying the patches onto a new branch shouldn't be too much trouble.

polaris64 avatar Jul 10 '22 14:07 polaris64

@polaris64 Ok I've rebased the symex-ts-integration branch. Locally, when I ran git rebase symex-ts-integration with your branch checked out, it seemed to do it without any conflicts. But let me know if you run into any difficulties, happy to help!

countvajhula avatar Jul 10 '22 20:07 countvajhula

@countvajhula, sorry to sound like a broken record, but apologies for the lack of recent activity again. I have a lot of things going on both at home and at work so I have to use any spare moments I get as they arise!

In any case, I've rebased this PR onto your upstream symex-ts-integration branch and I've continued with the work this evening. I've made some small tweaks to the paste transformations in order to try to match the functionality you detailed above. It's not perfect and needs testing but it's an improvement. I'm hoping to be able to continue with the other transformations so we can at least get the rest of the list ticked off, then we can hopefully work on improving the user experience.

polaris64 avatar Aug 14 '22 19:08 polaris64

@polaris64 Yeah that sounds like a good way of going about it - checking off the list first and then tweaking the experience later. In fact at that stage we can probably get the help of users to test it (I think @tommy-mor is always up for trying things out!).

Also, by way of signposting -- and this doesn't necessarily affect our current trajectory (unless you believe it should) -- but in the light of #26 , I'm thinking that eventually symex-ts could probably be split into a separate package. My current plan / recommendation would be to start an organization account and eventually host all the symex-related packages there, along with other public domain packages like Qi. I started an account for this purpose here: https://github.com/drym-org although there's nothing there at the moment. For now we can continue adding the feature directly into Symex as we'd originally planned, though.

My further plan along these lines is to eventually start some kind of crowdfunding account for the drym-org github account, and then distribute any contributions via an attribution-based scheme, an early prototype of what's described at drym.org. (In reality, attribution should not be restricted to any particular account or initiative or group, but this could a rudimentary start that, with any luck, will prove the concept for broader adoption).

I'm not sure when I'd get around to setting this up, but I wanted to give you an early heads up since it may involve some logistical stuff like moving repos around and whatnot, and I don't want that process to annoy you any more than necessary 😄

countvajhula avatar Aug 15 '22 19:08 countvajhula

I'm not sure when I'd get around to setting this up, but I wanted to give you an early heads up since it may involve some logistical stuff like moving repos around and whatnot, and I don't want that process to annoy you any more than necessary

This all sounds fine to me, don't worry! :) I'm happy contributing what I can to the project so I don't mind how the project is managed. In terms of moving repos around, that's fine too.

I've started reading your article on drym.org and it sounds interesting, so I'll be sure to read it more thoroughly when I get time.

Thanks for the heads-up!

polaris64 avatar Aug 21 '22 19:08 polaris64

@countvajhula I've just rebased this PR onto the latest symex-ts-integration commit. I added a fix due to the recent changes to the overlay and improved node deletion slightly.

Perhaps you could test this PR a bit and see if you feel it's ready to merge? We can then create a branch for the remaining transformations.

polaris64 avatar Aug 27 '22 20:08 polaris64

Looking great @polaris64 ! One issue that seemed to crop up in different ways was about selection. After entering text, or after doing a transformation of some kind, entering symex mode sometimes selected an unexpected expression. I suspect this can be solved at the level of symex-ts--handle-tree-modification and symex-ts--adjust-point and probably doesn't need changes in the transformations themselves.

Full results of my testing:

  • Append after (A) complains symex-append-after: Symbol’s function definition is void: symex-ts-append-after
  • Deleting at the end of any expression causes the root to be selected after the deletion instead of (ideally) the preceding node or containing node.
  • delete-remaining seems to work, even though it isn't checked off in the description above
  • entering symex mode while at the end of a word always causes it to select the containing expression instead of the one at point. E.g.:
def blah(hello):
    print("hello")
    a = 5|
    print("ciao")

With point at |, entering symex mode causes the entire body of the blah function to be selected instead of either 5 or a = 5. But placing the cursor at any other point within that expression, e.g. a = |5, causes it to select correctly.

  • yank-remaining seems to work even though it isn't checked off in the list
  • parenthesis behavior in TS is Lisp-/paredit-like (e.g. adds a space), and I'm not sure if that's what we want for non-Lisp languages. Should it not balance at all? Or should it balance but not add a space? We'd probably need to use it more to get a sense of the expected UX. Once we have a clearer idea about this, we will need to review whether we need separate minor modes for TS/Lisp, as @tommy-mor was suggesting earlier.

And yeah, I think the PR is fine to merge as is since it is introducing new things and doesn't break existing things. We can continue with the above fixes / remaining transformations in bite-sized follow-up PRs. I'll transfer the outstanding items directly into the integration PR description (or maybe try out GitHub Projects for this? I'll give it a look). Thanks!

countvajhula avatar Aug 30 '22 18:08 countvajhula