nixvim

How to merge unkeyed attrs?

Open jfly opened this issue 11 months ago • 4 comments

I'm curious how to merge multiple listToUnkeyedAttrs. For example:

./ $ nix repl github:nix-community/nixvim
...
nix-repl> builtins.trace (lib.nixvim.toLuaObject ((lib.nixvim.listToUnkeyedAttrs ["a" "b"]) // (lib.nixvim.listToUnkeyedAttrs ["c" "d"]) // {foo = "bar";})) null
trace: { "c", "d", foo = "bar" }
null

Ideally, I would want the resulting lua table to be:

{ "a", "b", "c", "d", foo = "bar" }

It feels like this is a fundamental issue with how listToUnkeyedAttrs represents unkeyed attrs:

./ $ nix repl github:nix-community/nixvim
nix-repl> lib.nixvim.listToUnkeyedAttrs ["a" "b"]
{
  __unkeyed-0 = "a";
  __unkeyed-1 = "b";
}

Have we considered one of the following?

  • Rework lib.nixvim.listToUnkeyedAttrs to represent unkeyed attrs in a different way? For example, rather than injecting integer indices into the keys, we could inject the value.
  • Create a custom merge function that knows to treat these __unkeyed-<N> attrs specially?

jfly, Apr 24 '25 06:04

Answer

Yes, this is a limitation of the current implementation.

Currently, unkeyed attrs are merged in the same way as any other attr-set, so __unkeyed-0 from one definition will try to merge with __unkeyed-0 from another definition. This means the only way to merge them is to manually specify different indexes.

Note that the indexes don't have to be sequential. For example, one definition could use indexes 1-10 while another uses indexes 100-110.
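
To make the collision and the workaround concrete, here is a minimal Python model. It is not Nix and not nixvim's implementation: dicts stand in for attrsets, dict unpacking stands in for `//`, and `to_lua` is a simplified stand-in for toLuaObject that assumes unkeyed entries are emitted in numeric index order. The `start` parameter is hypothetical and exists only for this sketch.

```python
import json

PREFIX = "__unkeyed-"


def list_to_unkeyed_attrs(values, start=0):
    # Mirrors the shape shown in the REPL output: {"__unkeyed-0": "a", ...}.
    # `start` is a hypothetical parameter added for this sketch only.
    return {f"{PREFIX}{start + i}": v for i, v in enumerate(values)}


def to_lua(attrs):
    # Simplified stand-in for toLuaObject: unkeyed values first, ordered by
    # numeric index (an assumption of this sketch), then named keys.
    unkeyed = sorted(
        ((int(k[len(PREFIX):]), v) for k, v in attrs.items() if k.startswith(PREFIX)),
        key=lambda pair: pair[0],
    )
    named = sorted((k, v) for k, v in attrs.items() if not k.startswith(PREFIX))
    parts = [json.dumps(v) for _, v in unkeyed]
    parts += [f"{k} = {json.dumps(v)}" for k, v in named]
    return "{ " + ", ".join(parts) + " }"


ab = list_to_unkeyed_attrs(["a", "b"])
cd = list_to_unkeyed_attrs(["c", "d"])

# Same indexes: the right-hand definition overwrites the left-hand one.
collided = {**ab, **cd, "foo": "bar"}
print(to_lua(collided))  # { "c", "d", foo = "bar" }

# Manually chosen, non-overlapping indexes: nothing is overwritten.
cd_shifted = list_to_unkeyed_attrs(["c", "d"], start=100)
distinct = {**ab, **cd_shifted, "foo": "bar"}
print(to_lua(distinct))  # { "a", "b", "c", "d", foo = "bar" }
```

The second result only works because the caller picked the offset by hand, which is the limitation described above.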

State of nixvim

In the vast majority of nixvim options, lua is typed as anything or attrsOf anything. This means we don't have an obvious way to implement custom merge logic.

The two ways we could go about it are:

  1. replace usage of types.anything treewide
  2. replace types.anything with a custom impl in nixvim's "extended lib"

For some additional context, there's a parallel effort for serialising lua in nixpkgs. Nixvim uses lib.nixvim.toLuaObject while nixpkgs has pkgs.formats.lua and lib.generators.toLua.

lua serialisation in nixvim vs nixpkgs

Eventually, I'd like to upstream nixvim-specific serialisation features into nixpkgs. However, some things are currently done differently:

  1. nixpkgs uses lib.mkLuaInline vs nixvim uses lib.nixvim.mkRaw
  2. nixpkgs has no support for mixed list/attrs lua tables vs nixvim has lib.nixvim.listToUnkeyedAttrs
  3. nixvim tries to avoid using ["string"] table-key syntax for valid lua identifiers

I've been brainstorming about a replacement for lib.nixvim.listToUnkeyedAttrs that would be suitable for upstreaming to nixpkgs.

  • Ideally, this would include an option type that can merge attr and list definitions into the combined table.
  • Ideally, the "mixed" data-type would be useful more generally than just in lua; for example it might also be used for PHP's associative arrays.

I have a nixpkgs/table branch where I was playing with this. Feedback or ideas welcome.

MattSturgeon, Apr 24 '25 11:04

> Ideally, the "mixed" data-type would be useful more generally than just in lua; for example it might also be used for PHP's associative arrays.

Ha! So, full disclosure, I'm not actually trying to merge unkeyed attrs in nixvim. I actually was researching how y'all handled that problem, because I've got a similar problem trying to add libconfuse support to nixpkgs. Having some sort of "mixed" data-type that could represent libconfuse's config format would be awesome. Perhaps a collaboration could work out here?

jfly, Apr 24 '25 21:04

> I actually was researching how y'all handled that problem, because I've got a similar problem trying to add libconfuse support to nixpkgs.

Nice. You can definitely learn from our mistakes 😁


Prior art for a mixed data type includes:

  • nixvim's "unkeyed attrs"
  • pkgs.formats.php's associative arrays
  • my draft tables branch

I believe the best way to go about this is to first settle on a structure to represent the mixed data, then create some util functions for working with that structure, and finally implement an option-type for it.

Other steps would include adding support to the relevant lib.generators functions and documenting it.

My draft branch has done steps 1 & 2, and has a sketched out option-type for step 3 (I believe the merge function may not be handling all edge cases yet; tests needed). Maybe I'll start a PR for steps 1&2 and postpone the option-type for a dedicated PR?


Regarding the data structure itself: the first question is whether it needs a _type tag, or whether it should be possible to infer that the data is mixed from the presence of (e.g.) a _list attr.

The second question is whether the named elements should be at the top level or nested in a separate attrs attribute.

The third question is whether the unnamed elements should be in a list attribute or be assigned "special" names (like __unnamed-1).


Personally, I believe nixvim's approach of using special __unkeyed-123 attr names is awkward. I think it'd be better to have a special list/_list/_unnamed/etc. attribute that contains an actual list of the unnamed elements.

I'm unsure whether it's better to have the named elements as top-level attrs or to nest them under an attrs/_attrs/named/_named attribute... On the one hand, top-level attr names are probably easier for end-users to work with. On the other hand, they could lead to name conflicts with the _list attribute that holds the unnamed elements. 🤔
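
As a rough illustration of one possible answer to these three questions, here is a Python sketch of the nested variant: a `_type` tag, a real `_list` for the unnamed elements, and the named elements under `attrs`. This is not the layout of the draft branch. The helper names `make_table` and `merge_tables` are hypothetical, and the conflict rule (identical values are allowed, differing values are an error) is an assumption made only to keep the sketch small.

```python
def make_table(unnamed=None, named=None):
    # Hypothetical constructor for a tagged "mixed" value.
    return {
        "_type": "table",
        "_list": list(unnamed or []),
        "attrs": dict(named or {}),
    }


def merge_tables(*definitions):
    # Unnamed elements are concatenated in definition order, so no manual
    # index bookkeeping is needed. Named elements are merged by key.
    merged_list = []
    merged_attrs = {}
    for definition in definitions:
        merged_list.extend(definition["_list"])
        for key, value in definition["attrs"].items():
            if key in merged_attrs and merged_attrs[key] != value:
                raise ValueError(f"conflicting definitions for {key!r}")
            merged_attrs[key] = value
    return make_table(merged_list, merged_attrs)


merged = merge_tables(
    make_table(["a", "b"]),
    make_table(["c", "d"]),
    make_table(named={"foo": "bar"}),
)
print(merged["_list"])  # ['a', 'b', 'c', 'd']
print(merged["attrs"])  # {'foo': 'bar'}
```

Because the named elements sit under their own key, a user-defined attr called `_list` cannot collide with the unnamed elements. The cost is the extra nesting that end-users have to write.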


> Perhaps a collaboration could work out here?

Sounds good 👍

MattSturgeon, Apr 24 '25 22:04

> I believe the best way to go about this is to first settle on a structure to represent the mixed data, then create some util functions for working with that structure, and finally implement an option-type for it. [...] Maybe I'll start a PR for steps 1&2 and postpone the option-type for a dedicated PR?

That sounds reasonable to me!

> The second question is should the named elements be at the top-level, or nested in a separate attrs attribute?

One thing to consider is whether it's important to preserve insertion order across keyed and unkeyed values. I believe lua intentionally does not preserve this information, but if we're trying to build a general mixed data type, we might need to. In other words, I believe these two lua tables are identical:

local t1 = { "first", second = 2, "third" }
local t2 = { "first", "third", second = 2 }

But I believe these two PHP arrays are "different" (PHP seems to remember insertion order):

php > print_r(array("first", "second" => 2, "third"));
Array
(
    [0] => first
    [second] => 2
    [1] => third
)
php > print_r(array("first", "third", "second" => 2));
Array
(
    [0] => first
    [1] => third
    [second] => 2
)

While it might be weird for a configuration file to have semantics that depend on this order, I think ideally we would build something that can represent it. If we take that as a design constraint, then I think the most natural representation is a list of Union[KeyValuePair, UnkeyedValue], perhaps at the cost of key lookups requiring a scan of the list.
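
A minimal sketch of that representation, written in Python since the type above already uses Python-style notation. The helper names `lookup`, `php_view` and `lua_view` are hypothetical. `php_view` numbers unkeyed entries from 0 in insertion order, matching the print_r output above, and `lua_view` throws the interleaving away, matching the claim that the two lua tables are identical.

```python
from dataclasses import dataclass
from typing import Any, List, Union


@dataclass(frozen=True)
class KeyValuePair:
    key: str
    value: Any


@dataclass(frozen=True)
class UnkeyedValue:
    value: Any


Mixed = List[Union[KeyValuePair, UnkeyedValue]]


def lookup(table: Mixed, key: str) -> Any:
    # Linear scan: this is the lookup cost mentioned above.
    for entry in table:
        if isinstance(entry, KeyValuePair) and entry.key == key:
            return entry.value
    raise KeyError(key)


def php_view(table: Mixed):
    # Order-preserving view: unkeyed entries get 0, 1, 2, ... as they appear.
    pairs = []
    next_index = 0
    for entry in table:
        if isinstance(entry, UnkeyedValue):
            pairs.append((next_index, entry.value))
            next_index += 1
        else:
            pairs.append((entry.key, entry.value))
    return pairs


def lua_view(table: Mixed):
    # Order-insensitive view: positional values plus a dict of named values.
    positional = [e.value for e in table if isinstance(e, UnkeyedValue)]
    named = {e.key: e.value for e in table if isinstance(e, KeyValuePair)}
    return positional, named


t1 = [UnkeyedValue("first"), KeyValuePair("second", 2), UnkeyedValue("third")]
t2 = [UnkeyedValue("first"), UnkeyedValue("third"), KeyValuePair("second", 2)]

print(lua_view(t1) == lua_view(t2))  # True
print(php_view(t1))  # [(0, 'first'), ('second', 2), (1, 'third')]
print(php_view(t2))  # [(0, 'first'), (1, 'third'), ('second', 2)]
print(lookup(t1, "second"))  # 2
```

The same value can therefore feed both an order-sensitive generator and an order-insensitive one. Only the second is free to reorder entries.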

jfly, Apr 24 '25 22:04