Document exploding of string values with $@str

Open hanche opened this issue 3 years ago • 5 comments

I learned in chat today that $@str explodes a string into its individual characters. This should be documented in the Variable use section of the language specification, along with the explosion of list values.

hanche, Feb 15 '21 21:02

Playing devil's advocate, I'm inclined to suggest that this should not only go undocumented but should also be an error, for consistency with other non-list use cases:

> m = [&a=1 &b=2]
> put $@m
Exception: cannot iterate map
[tty 2], line 1: put $@m
> s = "ab"
> put $@s
▶ a
▶ b

Why should strings and maps be treated differently? If we're going to allow using $@ to explode a string then it should be allowed for a map (presumably by outputting the keys of the map). If the user wants to treat a string as a list they should probably be required to explicitly construct the list of chars using str:split '' $string. Of course, you can also argue that strings are essentially lists of characters and/or bytes and therefore the $@s syntax makes sense as shorthand for str:split '' $s.
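
For comparison, here is a minimal sketch of the implicit and explicit forms side by side. It assumes a newer Elvish release that has the var form (the transcripts in this thread use the older s = "ab" assignment); str:split and keys are the existing commands mentioned in this thread, and the variable names are only illustrative.

```elvish
use str

var s = "ab"

# Implicit: explode the string with $@, as in the transcript above.
put $@s

# Explicit: split on the empty separator to get the individual characters.
put (str:split '' $s)

var m = [&a=1 &b=2]

# $@m raises an exception, so the keys have to be requested explicitly.
put (keys $m)
```

The explicit forms say what is being enumerated, which is the consistency argument made above; the $@s form is shorter, which is the convenience argument.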

krader1961, Feb 16 '21 04:02

Well, that is certainly an interesting point of view. I could point out, then, that the same applies to the builtin function all:

⬥ all [x y]
⮕ x
⮕ y
⬥ all xy
⮕ x
⮕ y
⬥ all [&x=1 &y=2]
Exception: map cannot be iterated
[tty 32], line 1: all [&x=1 &y=2]

The same goes for the for syntax:

⬥ for t [x y] { put $t }
⮕ x
⮕ y
⬥ for t xy { put $t }
⮕ x
⮕ y
⬥ for t [&x=1 &y=2] { put $t }
Exception: cannot iterate map
[tty 37], line 1: for t [&x=1 &y=2] { put $t }

This behaviour is also undocumented.

Now, about consistency. Granted, if “non-list” is a useful category, then the current behaviour seems inconsistent. However, I think “iterable” is the correct category, in which case the behaviour of $@, all, and for is perfectly consistent: They work on iterables and not on non-iterables.

So then the question is, what objects should be iterable? I would argue that strings are a natural candidate, since a string really is (or rather, may be considered as) a list of characters.

So what about maps? Certainly, a case can be made for maps also to be iterable. However, that raises the obvious question: What should be the result of exploding or iterating a map? One possibility is the keys, as suggested. But it can also be argued that [key value] pairs is more useful, as it saves the consumer from looking up the values.
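
Both candidate results can already be produced by hand, which may help when comparing them. The following is only a sketch, assuming a newer Elvish with var; it uses the existing keys command and map indexing, and makes no assumption about the order in which keys come out.

```elvish
var m = [&x=1 &y=2]

# Candidate 1: iterating a map yields its keys.
for k [(keys $m)] {
  put $k
}

# Candidate 2: iterating a map yields [key value] pairs,
# built here by looking up each value explicitly.
for k [(keys $m)] {
  put [$k $m[$k]]
}
```

The second loop shows the lookup that the [key value] form would save the consumer from writing.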

Further thoughts: Lists and strings have one more thing in common, namely slice syntax for getting sublists or substrings. It really seems to make good sense to treat the two alike, in every possible way.

It all seems quite consistent to me.

It appears to me that there are two ways forward:

  1. Keep the current behaviour of $@, all, and for when applied to strings (and optionally, extend it to maps), and document it, or
  2. Deprecate and later remove these behaviours.

Keeping the current behaviour without documenting it is not tenable. I hope we can at least agree on that.

hanche, Feb 16 '21 08:02

@krader1961 I don't understand your objection. Strings in Elvish have always been treated as lists, and this is documented. E.g.:

[~]|> a = foobar
[~]|> all $a
▶ f
▶ o
▶ o
▶ b
▶ a
▶ r
[~]|> put $a[2]
▶ o
[~]|> put $a[2..5]
▶ oba

Not sure why you think the standard $@list syntax for exploding it should work differently.

zzamboni, Feb 16 '21 09:02

My previous comment was somewhat tongue in cheek. Hence the "devil's advocate" preface. My point is that, as @hanche points out, the $@ syntax should probably be defined in terms of iterable objects (or "containers" as mentioned elsewhere in the documentation) and not an explicit "list". A map is an iterable container and therefore should be a valid object for $@var. Note that "list" has a well defined meaning in the documentation.

@zzamboni, the documentation you linked to says only that indexing a string behaves as if the index were applied to an equivalent list of UTF-8 bytes. Is there anywhere else it is documented that strings are conceptually equivalent to lists? Also, the all documentation only talks about lists and only has explicit list examples. So it's not clear how your example using all supports your argument.

Note that I am not actually arguing that the current behavior of strings in contexts such as $@ and all be changed. I am simply wondering whether the current behavior is inconsistent, and whether the magic $@ syntax should perhaps be dropped in favor of explicit container explosion via (all ...) and similar mechanisms (e.g., str:split '' $string).

krader1961, Feb 17 '21 04:02

@krader1961 You're doing great at being devil's advocate! At the opposite end of the available options, I've been thinking a bit about maps and how to iterate them. I was wondering above whether iterating a map should yield just the keys, or key-value pairs. So I had an idea: Why not both? So here we go, then:

Assuming $map holds a map, we could let $@map expand to the keys, and (wait for it) $@@map expand to key-value pairs.

What about all? Well, since we already have keys to extract the keys, we could let all $map produce key-value maps. I know, that makes for a bit of inconsistency, since ideally, (all $map) should be equivalent to $@map. But, I am brainstorming here. Maybe someone can think of a better way.

And finally, we could extend the for syntax, allowing both for key $map { … } and for key value $map { … }. The latter, being syntactically different, can't be confused with the single-variable version of for.

A totally different option would be to let iteration of a map always return key-value pairs, and then using destructuring, we could write the above examples as for [key _] $map { … } and for [key value] $map { … }. This allows for greater consistency, perhaps. Then $@map would return key-value pairs, as would all $map. Come to think of it, then we could get the keys by $@map[0].
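
To be clear, for [key value] $map { … } is a proposal and not existing syntax. A rough approximation with constructs that exist today, assuming a newer Elvish with var, could look like the sketch below; the names pair, key, and value are only illustrative.

```elvish
var m = [&x=1 &y=2]

for k [(keys $m)] {
  # Build the [key value] pair that the proposal would have iteration produce.
  var pair = [$k $m[$k]]
  # Destructure the pair by exploding it into two variables.
  var key value = $@pair
  echo $key $value
}
```

Under the proposal, the pair construction and the explosion would both be folded into the for header.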

PS. I have long thought the combination of $@ and indexing to be confusing, but that is a different discussion.

PPS. Getting back to your advocatus diaboli position, we should be careful about introducing too many syntax tricks. There is little danger that Elvish will become another APL, but we should indeed be careful about moving too far in that direction. OTOH, syntactic sugar can work wonders for the interactive features, with the attendant risk of rendering bigger programs unreadable.

hanche, Feb 17 '21 08:02