cursorless initial nix language support

EDIT: I removed the original PR text, as no longer relevant and will have made it harder to review.

This PR adds nix language support.

Checklist

[x] Recorded tests for the new language
[x] Used "change" / "clear" instead of "take" for selection tests to make recorded tests easier to read
[x] Added a few specific tests that use "chuck" instead of "change" to test removal behaviour when it's interesting, especially:
- [x] "chuck arg" with single argument in list
- [x] "chuck arg" with multiple arguments in list
- [x] "chuck item" with single argument in list
- [x] "chuck item" with multiple arguments in list
[x] Added @textFragment captures. Usually you want to put these on comment and string nodes. This enables "take round" to work within comments and strings.
[x] Added a test for "change round" inside a string, eg "hello (there)"
[-] Supported "type" both for type annotations (eg foo: string) and declarations (eg interface Foo {}) (and added tests for this behaviour 😊)
[x] Supported "item" both for map pairs and list entries (with tests of course)

Supported	Tested	Term	Capture	Definition	Comment
✓	✓	`list`	`@list`	List type equivalent	-
✓	✓	`inside list`	`@list.interior`	Inside of a list	-
✓	✓	`map`	`@map`	Dictionary type equivalent	-
✓	✓	`inside map`	`@map.interior`	Inside of a dictionary	-
✓	✓	`key`	`@collectionKey`	Dictionary key equivalent	-
✓	✓	`funk`	`@namedFunction`	A named function declaration	-
✓	✓	`inside funk`	`@namedFunction.interior`	The inside of a lambda declaration	-
✓	✓	`funk name`	`@functionName`	Name of declared function	-
✓	✓	`lambda`	`@anonymousFunction`	A lambda declaration	-
✓	✓	`inside lambda`	`@anonymousFunction.interior`	The inside of a lambda declaration	-
✓	✓	`name`	`@name`	Variable name	-
✓	✓	`value`	`@value`	Right-hand-side value of an assignment	-
_	_	`value`	`@value`	Value returned from a function	-
✓	✓	`value`	`@value`	Value of a key-value pair	-
✓	✓	`state`	`@statement`	Any single coded statement	-
✓	✓	`if state`	`@ifStatement`	An if conditional block	-
✓	✓	`condition`	`@condition`	Condition of an if block	-
_	_	`condition`	`@condition`	Condition of a while loop	-
_	_	`condition`	`@condition`	Condition of a do while style loop	-
_	_	`condition`	`@condition`	Condition of a for loop	-
_	_	`condition`	`@condition`	Condition of a ternary expression	-
✓	✓	`branch`	`@branch`	The resulting code associated with a conditional expression	-
✓	✓	`inside branch`	`@branch.interior`	The statements associated with the body of a conditional expression	-
✓	✓	`comment`	`@comment`	Single line code comment	-
✓	✓	`comment`	`@comment`	Multi-line code comment	-
✓	✓	`string`	`@string`	Single line strings	-
✓	✓	`string`	`@string`	Multi-line strings	-
✓	✓	-	`@textFragment`	Used to capture string-type nodes (strings and comments)	-
✓	✓	`call`	`@functionCall`	A function call (not a function definition)	-
✓	✓	`callee`	`@functionCallee`	Name of the function being called	-
✓	✓	`arg`	`@argumentOrParameter`	Arguments to function definition	-
✓	✓	`arg`	`@argumentOrParameter`	Arguments to function calls	-
✓	✓	`arg`	`@argumentOrParameter`	Interpolated variables in strings	-
_	_	`class`	`@class`	Class or structure declaration	-
_	_	`inside class`	`@class.interior`	The inside of a class declaration	-
_	_	`class name`	`@className`	Name of class or structure declaration	-
_	_	`type`	`@type`	Type declarations	-

Checklist

[x] I have added tests
[ ] I have updated the docs and cheatsheet
[x] I have not broken the cheatsheet

Sep 24 '23 05:09 fidgetingbits

Ok thanks for the tests, though we prefer "change" instead of "take" because it makes the tests much easier to read. See the PR template. Also, "chuck" tests are only necessary in cases where "chuck" is interesting, eg removing commas, etc

If you don't want to re-record the "take" tests, you can just find and replace setSelection to clearAndSetSelection in your recorded tests, and then run the "update test fixtures" launch config, which should handle the rest for you

Sep 28 '23 12:09 pokey

I'd be happy to try this out on some real files once the corresponding parse-tree PR is merged.

Sep 28 '23 13:09 auscompgeek

One thing I can't entirely decide is if take lambda from the right most lambda should capture all curried lambdas as one highlight or should individually select each (the way it works currently). Taking all curried lambdas would make it feel more like other languages when dealing with functions, which seems nice.

I have no functional programming background, so someone with experience might think differently?

To be clear, in nix technically x = a: b: a + b is two lambda expressions. So if the cursor is in a + b I say take lambda I get:

But if the cursor is in the a: part I say take lambda I get:

This also gets a bit cumbersome if your cursor is in the a + b and say take every arg and expect it to highlight both a and b. Currently it would only select b. I think I'd rather it select all.

I think one possible exception to this should be if one of the lambdas is explicitly wrapped in () which to me implies some sort of explicit separation vs a regular multi-argument lambda?

Maybe this was already discussed/decided for other languages, but I didn't check yet.

Worth noting that atm I also made it so above works as a single "named function" where x is the function name, so if you say take funk you get:

and take funk name:

Not sure what to think of that yet, or how it currently works in other languages. But my thinking is because nix is only lambda functions, meaning named functions won't be used elsewhere, this might be nice to use.

Oct 02 '23 01:10 fidgetingbits

huh interesting. so is currying the only way to define a lambda with multiple parameters? If so, then I think I'd lean towards treating a: b: a+b as lambda, and having "every arg" iterate over a and b. Lmk if you need help getting that to work

@auscompgeek @josharian @AndreasArvidsson curious to hear your thoughts tho

Oct 02 '23 11:10 pokey

I'd be happy to try this out on some real files once the corresponding parse-tree PR is merged.

@auscompgeek fyi nix support merged into parse-tree a little while ago in case you didn't see.

Nov 01 '23 01:11 fidgetingbits

`"item"`

I was really expecting a lot more items here 😅. In particular, for anything that you're considering a "map", any contained key-value pairs should be items

Nov 08 '23 13:11 pokey

Hmm nix really does blur the line between statement and item, doesn't it?

My mind kind of expects the single-line ones not to be statements 😅. But that doesn't feel like a good distinction to draw. Idk cc/ @auscompgeek . Looks like there's no difference between a "statement block" and a "map"?

Nov 08 '23 13:11 pokey

`"call"`

I did not expect this to be a call, especially since there is no "callee". I think a good rule of thumb is that if there's a "call", there should be a "callee"

Nov 08 '23 13:11 pokey

`"callee"`

This callee where the same callee range has two nested domains was surprising. I also didn't expect an "import" to be a "call" / "callee" at all, but if people see it that way I won't object too strongly 😊

Nov 08 '23 13:11 pokey

Also, would be good to make sure you're formatting with @AndreasArvidsson 's formatter; I'm getting a big autoformat when I save one of your files locally

Nov 08 '23 15:11 pokey

"item"

I was really expecting a lot more items here 😅. In particular, for anything that you're considering a "map", any contained key-value pairs should be items

This is fixed now for attrsets.

Nov 13 '23 01:11 fidgetingbits

Hmm nix really does blur the line between statement and item, doesn't it?

My mind kind of expects the single-line ones not to be statements 😅. But that doesn't feel like a good distinction to draw. Idk cc/ @auscompgeek . Looks like there's no difference between a "statement block" and a "map"?

Ya. Really there isn't much similarity to what you'd normally think of as a statement in other (non functional) languages at least. Originally I was leaning towards statements because there are so many complex expressions and stuff when setting key/value pairs, but I think using items feels more natural after playing with it for awhile. I've disabled most statement matching for now to test more, and also added better item support.

Nov 13 '23 01:11 fidgetingbits

Also, would be good to make sure you're formatting with @AndreasArvidsson 's formatter; I'm getting a big autoformat when I save one of your files locally

Weird, I do have that installed and I do get auto-formatting in .scm files. I originally thought it might be prettier conflicting or something, but if I disable prettier and then format .scm document, I don't see any other changes. What kind of changes do you see?

Nov 13 '23 02:11 fidgetingbits

Also, would be good to make sure you're formatting with @AndreasArvidsson 's formatter; I'm getting a big autoformat when I save one of your files locally

Weird, I do have that installed and I do get auto-formatting in .scm files. I originally thought it might be prettier conflicting or something, but if I disable prettier and then format .scm document, I don't see any other changes. What kind of changes do you see?

I think it may just be indentation level. You might need to install the editorconfig plugin so that your VSCode will respect our .editorconfig. I wonder if we should add that to recommended extensions cc/ @AndreasArvidsson @auscompgeek

Nov 13 '23 13:11 pokey

Also, I see this is still draft. Did you want me to give it another review or shall I hold off while you continue to play with it?

Nov 13 '23 13:11 pokey

Also, would be good to make sure you're formatting with @AndreasArvidsson 's formatter; I'm getting a big autoformat when I save one of your files locally

Weird, I do have that installed and I do get auto-formatting in .scm files. I originally thought it might be prettier conflicting or something, but if I disable prettier and then format .scm document, I don't see any other changes. What kind of changes do you see?

I think it may just be indentation level. You might need to install the editorconfig plugin so that your VSCode will respect our .editorconfig. I wonder if we should add that to recommended extensions cc/ @AndreasArvidsson @auscompgeek

Thanks, ya that was it it seems. Installed now. I am using your pre-commit hooks as well, so kinda surprised that doesn't automatically fix it either.

Nov 14 '23 01:11 fidgetingbits

Also, I see this is still draft. Did you want me to give it another review or shall I hold off while you continue to play with it?

can hold off for now. I'm gonna tweak some stuff and there are still a few bugs that I didn't fix yet. Am looking forward to this getting in eventually though. Has been a fun experience.

Nov 14 '23 01:11 fidgetingbits

Thanks, ya that was it it seems. Installed now. I am using your pre-commit hooks as well, so kinda surprised that doesn't automatically fix it either.

Yeah scm autoformat doesn't run as a hook yet as it's just implemented in that VSCode extension. We need to turn it into a cli

Nov 14 '23 10:11 pokey

This one was also surprising to me. I guess it makes more sense in a one-line assert?

Things seem a lot nicer with most of these assert issues addressed (not pushed yet). I like the idea of having branch apply only if it's a one-liner, but from searching the repo I don't see how its possible to actually have that as a query predicate? Is there some existing example of where you've already done this or would I have to add a new predicate?

Nov 14 '23 12:11 fidgetingbits

I like the idea of having branch apply only if it's a one-liner, but from searching the repo I don't see how its possible to actually have that as a query predicate? Is there some existing example of where you've already done this or would I have to add a new predicate?

Oh I wasn't suggesting actually changing definition of "branch" depending on whether it's one or multiple lines. I think Python has taught us that making whitespace significant is a deeply problematic notion 😅. I think it would be quite surprising if a scope behaved differently if it was on one or multiple lines. Unless nix devs see them as significantly different constructs? cc/ @auscompgeek

Lmk if you want me to take another look, or to drop into a meet-up to discuss some of this stuff

Nov 15 '23 13:11 pokey

Fair enough. As much as I hate python whitespace, I actually think there are maybe a couple cases in nix where it could sort of make sense to be whitespace dependent to feel "natural", but I don't have time to explain right atm.

I'm new to coding nix so I'd rather hold off implementing weird things I assume maybe make sense and instead just push something that has the basics covered and then wait until more people (hopefully more experienced devs) get to play with it.

You can feel free to take another look now ya.

There are two issues that I think might require new query predicates but aren't show stoppers (maybe... will see how much you groan at how I handled call arguments in apply/select expressions lol), so happy to solve them as follow up. Will explain in more detail in a comment tomorrow.

I do hope to actually make it to a meetup at some point :D

Nov 15 '23 14:11 fidgetingbits

So to elaborate on a few changes and issues:

The statement stuff has changed to only be lines within the first ... part of let ... in ... statements, which feels a lot better.

There are I think 3 things could be done that I'm not sure how without possibly building new query predicates, so I'll just list them here.

The call argument handling nesting is gross, but works. I suspect there could be a predicate introduced that repeats some node match nested up to N deep or something. I'd probably rather do that as a follow up. I suppose worth noting I also had tried to use #any-of to match both an apply and select expression in the same big query, but ran into some odd errors which I couldn't figure out, which may be a bug I guess. #any-of would only fix the query duplication, not the huge gross query.
Probably we just won't use assert branch scope at all. But just touching on the idea of certain scopes only working depending on white space Like inherit n; on a single line could almost be thought of as a statement (maybe I'm too general with the way I think of statement). So for example you often see short assignments that might look like this:

  pkgs = import pkgsLibPath { inherit lib; pkgs = null; };

In this case intuitively I don't think the inherit would be make sense to be its own statement. But then you often also see longer assignments that are laid out like this:

  pkgs = import pkgsLibPath {
    inherit lib;
    pkgs = null;
  };

In the latter I find it feels natural to want to target the inherit line if it were a statement. Of course can just use line as the target, but just clarifying why in my brain it feels sort of natural to think of it differently depending on the white space.

There is a similar case with the with were you can have like maintainers = with maintainers; [ onny ] ++ teams.php.members; and:

{ config, lib, pkgs, ... }:

with lib;

let

  cfg = config.security.sudo;

Anyway, I think you are right that avoiding whitespace-based differences, but I wanted to explain anyway.

I took a stab at adding inside comment support and ran into some issues. There doesn't seem to currently be a way to use a predicate to query the contents of the matched node. If I do something like:

  (
    (comment) @comment.interior
    (#shrink-to-match! @comment.interior "# ?(?<keep>.*)$")
  )

It works-ish, but the problem is that there are two types of comment both which match as (comment) nodes. So by having the above requiring #strink-to-match then I lose the ability to match on the multi-line comments. I tried something naive like:

[
  (
    (comment) @comment.interior
    (#shrink-to-match! @comment.interior "# ?(?<keep>.*)$")
  )
  (
    (comment) @comment.interior
    ;; This would need to be a multi-line match, which I'm not sure shrink-to-match can do
    (#shrink-to-match! @comment.interior "/\*(?<keep>.*)\*/$")
  )
] @comment

But this doesn't work since #shrink-to-match isn't actually a match predicate per se, it's just a modifier on what was already matched, so if you try to match on the multiline variant, since (comment) still matches in the first case, #shrink-to-match will error because it doesn't find the #.

So this made me think we probably want something like a #starts-with or even just a #regex predicate where we could just have regex("^#") for single line and #regex("^/\*") for double line, and then have the #strink-to-match after.

I think that's it. Definitely understand why you like lots of small PRs, as it gets hard to track all these different issues (even for me actively working on it, let alone you lol).

Nov 16 '23 01:11 fidgetingbits

So to elaborate on a few changes and issues:

The statement stuff has changed to only be lines within the first ... part of let ... in ... statements, which feels a lot better.

There are I think 3 things could be done that I'm not sure how without possibly building new query predicates, so I'll just list them here.
1. The call argument handling nesting is gross, but works. I suspect there could be a predicate introduced that repeats some node match nested up to N deep or something. I'd probably rather do that as a follow up. I suppose worth noting I also had tried to use `#any-of` to match both an apply and select expression in the same big query, but ran into some odd errors which I couldn't figure out, which may be a bug I guess. `#any-of` would only fix the query duplication, not the huge gross query.

Here is a proposal for a new query predicate operator: https://github.com/fidgetingbits/cursorless/pull/1. I think it's actually fairly general purpose

2. Probably we just won't use `assert` branch scope at all. But just touching on the idea of certain scopes only working depending on white space Like `inherit n;` on a single line could almost be thought of as a statement (maybe I'm too general with the way I think of statement). So for example you often see short assignments that might look like this:
  pkgs = import pkgsLibPath { inherit lib; pkgs = null; };
In this case intuitively I don't think the inherit would be make sense to be its own statement. But then you often also see longer assignments that are laid out like this:
  pkgs = import pkgsLibPath {
    inherit lib;
    pkgs = null;
  };
In the latter I find it feels natural to want to target the inherit line if it were a statement. Of course can just use line as the target, but just clarifying why in my brain it feels sort of natural to think of it differently depending on the white space.

There is a similar case with the with were you can have like maintainers = with maintainers; [ onny ] ++ teams.php.members; and:
{ config, lib, pkgs, ... }:

with lib;

let

  cfg = config.security.sudo;
Anyway, I think you are right that avoiding whitespace-based differences, but I wanted to explain anyway.

I would be tempted to just err on the side of more scopes. So if it makes sense multi-line, let's include it always. Can always "grand state" your way out of it or target part of the scope not trapped

3. I took a stab at adding `inside comment` support and ran into some issues. There doesn't seem to currently be a way to use a predicate to query the contents of the matched node. If I do something like:
  (
    (comment) @comment.interior
    (#shrink-to-match! @comment.interior "# ?(?<keep>.*)$")
  )
It works-ish, but the problem is that there are two types of comment both which match as (comment) nodes. So by having the above requiring #strink-to-match then I lose the ability to match on the multi-line comments. I tried something naive like:
[
  (
    (comment) @comment.interior
    (#shrink-to-match! @comment.interior "# ?(?<keep>.*)$")
  )
  (
    (comment) @comment.interior
    ;; This would need to be a multi-line match, which I'm not sure shrink-to-match can do
    (#shrink-to-match! @comment.interior "/\*(?<keep>.*)\*/$")
  )
] @comment
But this doesn't work since #shrink-to-match isn't actually a match predicate per se, it's just a modifier on what was already matched, so if you try to match on the multiline variant, since (comment) still matches in the first case, #shrink-to-match will error because it doesn't find the #.

So this made me think we probably want something like a #starts-with or even just a #regex predicate where we could just have regex("^#") for single line and #regex("^/\*") for double line, and then have the #strink-to-match after.

I would just include the # . Does that make your life easier?

I think that's it. Definitely understand why you like lots of small PRs, as it gets hard to track all these different issues (even for me actively working on it, let alone you lol).

Ha yeah 😅

Also, looks like this PR has some of the same leading / trailing / removal issues I fixed in #1962, so would be awesome if you could take a crack at fixing them. Basically:

Leading and trailing is designed for delimiters. We only remove one of them; not both. Think of a comma-separated list, where deleting two commas would smash the siblings together. If you need to remove something before and after, just make a removal range
You no longer need to specify @leading.end; it's implied to be the scope itself
Instead of @leading.start.endOf, you can now just say @leading.endOf; start is now implied
See my tweaks to #1962 if you're not sure what I mean; hopefully that will make things clear

Feb 02 '24 15:02 pokey

Here is a proposal for a new query predicate operator: fidgetingbits#1. I think it's actually fairly general purpose

We discussed this one in a meet-up and decided it was simpler to just introduce a query predicate operator called #recursively-expand-to-parent-of-type! where you give it a node and a type and it recursively expands the range to direct parents of the given type. So in this case it woudl be (#recursively-expand-to-parent-of-type! @functionCallee.domain "apply_expression"). Make sense @fidgetingbits? Lmk if you want help implementing that

Feb 06 '24 17:02 pokey

Sounds good. I do really need to get more familar with typescript, so will try to give it a whirl when I find the time.

Feb 12 '24 03:02 fidgetingbits

As proposed above, I don't think we'll end up going the route of https://github.com/fidgetingbits/cursorless/pull/1, but I captured it in a tag just in case we want it in the future, and the PR disappears from @fidgetingbits's repo for whatever reason (not sure how things work with forks). Here's the commit https://github.com/cursorless-dev/cursorless/commit/52ce5b123b0c5bd66f2860218b602244adfb05bb

Apr 22 '24 14:04 pokey

@fidgetingbits I've tried to read through the history of this one, but I think it's better if I just ask you. Is this one ready? Are you happy with the current state?

Jan 14 '25 15:01 AndreasArvidsson

@fidgetingbits I've tried to read through the history of this one, but I think it's better if I just ask you. Is this one ready? Are you happy with the current state?

No it's got some bugs and some decisions I made early on I think need to change now that I'm much more familiar with the language. I will come back to it eventually but I haven't been able to actually work on it in months, although I do use it every day.

I guess one question for you is, given what a mess this PR became, would it be better/easier to just close it and make smaller PRs over time?

Jan 14 '25 16:01 fidgetingbits

Okay I will convert it to draft in the meantime.

You can decide if you want to continue to work on this pull request or close it and make multiple smaller ones. I unfortunately don't have the time to review languages I don't know in the kind of detail that Pokey did. I will more or less entrust the behavior to you. ie when you say that it's ready and you are happy with it we will merged unless I find something I definitely don't like. Sounds good?

Jan 15 '25 06:01 AndreasArvidsson

cursorless cursorless copied to clipboard

initial nix language support

Checklist

Checklist

"item"

"call"

"callee"

"item"

cursorless
cursorless copied to clipboard

`"item"`

`"call"`

`"callee"`

`"item"`