fish-shell icon indicating copy to clipboard operation
fish-shell copied to clipboard

Should `string join \n` disable newline-splitting on the results?

Open lilyball opened this issue 6 years ago • 22 comments

The command string join \n is rather pointless because the output of this command undergoes newline-splitting again. We should consider disabling newline-splitting for string join in general, as the whole point of this command is to join all arguments into a single string.

lilyball avatar Jun 16 '19 04:06 lilyball

Hmm, I use string join \n as a quick way to echo everything in a var with newlines. Often. echo itself doesn't really work, as it'll join everything with spaces.

Darkshadow2 avatar Jun 16 '19 09:06 Darkshadow2

printf '%s\n' $args would do the same thing. And what I'm proposing doesn't affect your use-case anyway, it only affects what happens when the command is used inside a command substitution (i.e. it would still output the same actual text, I just want to disable the implicit newline-splitting that occurs with command substitutions).

Edit: Well, printf '%s\n' $args actually includes a trailing newline that string join \n $args doesn't, but that may be a good thing anyway.

lilyball avatar Jun 16 '19 22:06 lilyball

Ah, you didn't mention command substitutions in your first post.

And yeah, I know about printf '%s\n' $args, but that trips something in my head that says "No! Can't do that, there's not enough format specifiers for the variables!" I laugh when I think it, but that's why I tend to use string join \n instead.

Darkshadow2 avatar Jun 17 '19 09:06 Darkshadow2

We should consider disabling newline-splitting for string join in general

So, there are two cases where newline-splitting can happen with string join:

  • When the joiner contains a newline, even if reading from stdin

  • When it's reading from argv and the argument contains a newline

When it's reading from stdin and the joiner doesn't contain a newline, it by definition only generates one line.

We should, in general, be careful with piling more special cases onto the splitting behavior (currently it's string split and string split0 at the end of a pipeline are special, IIRC - we should document this better), but adding string join to that is probably alright because these are very weird edge-cases.

IMHO only making one of those special would be the worst possible solution.

faho avatar Jun 17 '19 15:06 faho

FWIW, I see string split0 being magical:

$ echo [(printf 'a\na\0b' | string split0)]
[a
a] [b]

but I don't see it in string split:

$ echo [(printf 'a\na,b' | string split ,)]
[a] [a] [b]

(not arguing for or against, just to be precise about current behavior)

cben avatar Jul 07 '19 12:07 cben

@cben: string split will act on the input per-line. So in your example it reads the first line, doesn't find a splitter, prints it, reads the next line, finds a splitter, prints the part before, prints the part after, done.

Compare:

echo [(string split , a\na,b)]

faho avatar Jul 07 '19 12:07 faho

I seriously trying to understand the use-case for this. I mean, let me get 2 examples (I'll remove part of the output)

~ $ string join \n (string split  $PATH)
/home/[USERNAME]/.asdf/bin
/home/[USERNAME]/.node_modules/bin
/usr/local/sbin
/usr/local/bin
/usr/bin

And if I want it single-line

~ $ echo (string join \n (string split  $PATH))
/home/[USERNAME]/.asdf/bin /home/[USERNAME]/.node_modules/bin /usr/local/sbin /usr/local/bin /usr/bin

Now consider how @Darkshadow2 uses it:

Hmm, I use string join \n as a quick way to echo everything in a var with newlines. Often. echo itself doesn't really work, as it'll join everything with spaces.

Unless there is a serious reason for changing string join \n, this should be considered a breaking change, so IF it is to be done, should happen only on a major release, and my 5 cents are: If it's not broken, don't try to fix it.

Even considering @lilyball argument of:

printf '%s\n' $args would do the same thing. And what I'm proposing doesn't affect your use-case anyway, it only affects what happens when the command is used inside a command substitution (i.e. it would still output the same actual text, I just want to disable the implicit newline-splitting that occurs with command substitutions).

It's WAY easier to include string join "" (string split $SOMETHING) in the docs.There's a pretty high probability of at least one dude actually that uses this for something that will break, even if we have no idea of who or what or how this is relevant. Someone may call this dude wrong, dumb, genius, idiot, innovative, clever-ass-hacker or whatever, this will still be a breaking change, and there's a reason we take breaking changes seriously.

This quote probably is a lot better to convey the feeling:

The fundamental problem is that the current development model contains no well-defined points where breakages of the kernel-related userspace were allowed and expected by users.

The basic rule should be "never". For example, we now have three different generations of the "stat()" system call, and yes, we wrote the code to maintain all three interfaces. Breaking an old one for a new better one simply wasn't an option. - Linus Torvalds

Source: gcc and kernel stability (Linus Torvalds)

Thing is, people use this stuff, even in the ways nobody could guess. Obviously it's waaaaaay more acceptable to break stuff in a shell (that isn't even the default in most distros) than breaking the Kernel user space, but a breaking change is still a breaking change, so it should, at the very least, be considered a breaking change, discussed as a breaking change, and released as a breaking change....

frnco avatar Aug 21 '19 04:08 frnco

And if I want it single-line

~ $ echo (string join \n (string split  $PATH))
/home/[USERNAME]/.asdf/bin /home/[USERNAME]/.node_modules/bin /usr/local/sbin /usr/local/bin /usr/bin

This right here is the perfect example of why we want to do this. This line is very obviously broken. You asked to join with \n but it didn't.

Unless there is a serious reason for changing string join \n, this should be considered a breaking change

It is technically a breaking change, but again, your motivating example here is obviously bad behavior and making echo (string join \n foo) work as expected (join with newlines) would be fixing a bug.

The behavior we're proposing to change is behavior that today is 100% useless. cmd (string join \n $foo) is identical to cmd $foo. The only benefit of string join \n $foo is when you're piping it to something else, and we're not proposing to change the behavior of piping, just string splitting when used to terminate a substitution. As the current behavior there is useless, there shouldn't be any problem with changing it.

It's WAY easier to include string join "" (string split $SOMETHING) in the docs.

Easier than what? The command you just listed squishes all of the arguments into a single string with no separator. It's not equivalent to anything being discussed.

There's a pretty high probability of at least one dude actually that uses this for something that will break

Nobody can possibly be using this for anything. People can be using this, but it doesn't actually do anything today, so whatever they're trying to use it for is already broken.

Source: gcc and kernel stability (Linus Torvalds)

kernel stability rules should not be used as the template for literally anything except the kernel. Linux is insanely conservative and its stability rules are not appropriate for anything else.


If we say "we can't do this until Fish 4.0" then we should just close this issue now. Fish has major changes approximately once per millennium.

As it stands, cmd (string join \n $foo) is a 100% useless call and does not do what it looks like it should. The current behavior leads to confusion (e.g. trying to run echo (string join \n $foo) and wondering why it didn't work) and is just generally useless.

lilyball avatar Aug 21 '19 04:08 lilyball

Pushing this out to the next release as we don't appear to have rough consensus yet.

zanchey avatar Sep 25 '19 00:09 zanchey

This right here is the perfect example of why we want to do this. This line is very obviously broken. You asked to join with \n but it didn't.

It is technically a breaking change, but again, your motivating example here is obviously bad behavior and making echo (string join \n foo) work as expected (join with newlines) would be fixing a bug.

I completely agree with all of those.

Easier than what? The command you just listed squishes all of the arguments into a single string with no separator. It's not equivalent to anything being discussed. Inser a space or whatever between the "", we can all be civilized righr?

Now for these I only partially agree:

kernel stability rules should not be used as the template for literally anything except the kernel. Linux is insanely conservative and its stability rules are not appropriate for anything else.

Yeah they're going overboard. Doesn't mean wh should just toss everything in the trash, a lot of things there may be enlightening, as with everything, I joust pointed, although as a Programmer I agree with the idea that it's just plain wrong, I mean I use fucking Arch, there's nothing that tells you to deal with stuff even nearly as much as Arch does.

Nobody can possibly be using this for anything. People can be using this, but it doesn't actually do anything today, so whatever they're trying to use it for is already broken.

I do also believe nobody uses, but I deal with factes and statistics, not guessing, and I don't have data to back this, so I considered the worst-case scenary, even though I actually agree, kinda like work ethics I guess...?

It is technically a breaking change, but again, your motivating example here is obviously bad behavior and making echo (string join \n foo) work as expected (join with newlines) would be fixing a bug.

COMPLETELY AGREE with this, I just think this should either be versioned as a breaking change, so in the minuscule chance someone actually uses this, they know when they see the version changed. Most people don't care about versions 'cause it's not a library, but if you use stuff in production it's kinda nice to be able to have some hint something could be wrong when updating, but without impacting regular users.

If we say "we can't do this until Fish 4.0" then we should just close this issue now. Fish has major changes approximately once per millennium.

Now for this one, why not just freaking change it to 4.0, and including in 4.0 a warning that Versioning is now 1.2.3.4-5 (Or whateber) the segments being

  1. Changes that impact a lot of people
  2. Changes that may break some stuff that is not widely used.
  3. ...

And whatever for the rest... I mean, at least people would know what to expect. Fish is not a Library, but it's a shell, and I still have a shitload of bash scripts that I'm converting to Fish a bit at a time.If someone by a miracle was stupid enough to do this on a script that deals with anything more relevant, i.e. stuff like banking data, then yes, this could create a lot of problems. Likely? Nope. But still, the Challenger did explode because it was launched from temperatures below freezing, despite the engineers being brushed aside with the seemingly acceptable reason that the Challenger was designed to go to space where the temperature is below freezing.

Not that I thing the things are comparable, but as I say, to me it feels wrong, kinda like doing something unethical. Even if nobody would be hurt, I would feel pretty bad doing something like this, and thought you guys should consider this, so that everyone'll be prepared in case something happens to someone.

Though empathy is so close to disappearing that maybe nobody still cares for other people, ethics and stuff like that. Still, I wouldn't do it, and I thought I should point my reasons in the off chance someone might find it relevant.

But yeah, it's the devs choice, and ultimately you gotta go for one of the options, even if it's just "less undesirable".

frnco avatar Oct 07 '19 21:10 frnco

Now for this one, why not just freaking change it to 4.0, and including in 4.0 a warning that Versioning is now 1.2.3.4-5 (Or whateber) the segments being

We could also just say "Fish doesn't use Semantic Versioning, fish version 3.(x+1) may include minor breaking changes from fish version 3.x"

lilyball avatar Oct 08 '19 04:10 lilyball

Now for this one, why not just freaking change it to 4.0, and including in 4.0 a warning that Versioning is now 1.2.3.4-5 (Or whateber) the segments being

We could also just say "Fish doesn't use Semantic Versioning, fish version 3.(x+1) may include minor breaking changes from fish version 3.x"

I'm up for that. Simply warning people that they must not count on every API staying the same (At least for advanced usage) should already make people know what to expect, and perhaps even lock their versions if they do something ftoo critical with Fish.

frnco avatar Oct 08 '19 14:10 frnco

This seems too risky at this point for 3.2.0, pushing to 3.3.0

ridiculousfish avatar Jan 12 '21 01:01 ridiculousfish

This seems too risky at this point for 3.2.0, pushing to 3.3.0

I'm still skeptical of changing behavior of stuff that works and is used by others, but if this is actually necessary I'd suggest to at least include in the docs some way to get the old behavior after the change, so anyone can fix any scripts or whatever that rely on that.

frnco avatar Mar 08 '21 21:03 frnco

By definition scripts cannot rely on the old behavior, they can only accidentally invoke the old behavior and not notice because of a bug. Any usage of string join \n as the last command in a command substitution is useless, so the fix there is to just stop doing that instead of toggling a flag to preserve old behavior.

lilyball avatar Mar 08 '21 22:03 lilyball

By definition scripts cannot rely on the old behavior, they can only accidentally invoke the old behavior and not notice because of a bug. Any usage of string join \n as the last command in a command substitution is useless, so the fix there is to just stop doing that instead of toggling a flag to preserve old behavior.

And if that accident happened and the result worked for some people and they left it running like that and it breaks, it'd be nice to have at least a line on the changelog explaining what changed and how to get the previous result, wouldn't it? Maybe I'm the weird one, just checking.

frnco avatar Mar 09 '21 19:03 frnco

it'd be nice to have at least a line on the changelog explaining what changed and how to get the previous result, wouldn't it?

Obviously changes will be mentioned in the changelog, yes.

That's not something you need to ask for specifically, and doing so comes across as a bit condescending.

Yes, we would add this to the changelog and yes, we would see to it that it breaks as little as possible (basically nothing because any current use is bogus to begin with).

faho avatar Mar 09 '21 19:03 faho

it'd be nice to have at least a line on the changelog explaining what changed and how to get the previous result, wouldn't it?

Obviously changes will be mentioned in the changelog, yes.

That's not something you need to ask for specifically, and doing so comes across as a bit condescending.

Yes, we would add this to the changelog and yes, we would see to it that it breaks as little as possible (basically nothing because any current use is bogus to begin with).

That pretty much covers my worries, breaking changes will bother me forever but that doesn't mean I'm right, and I think in this case it may actually be the best option. Though, I still think there are a lot of way more relevant issues, but the one who should decide on the priorities is the one doing the coding so I can just work on whatever I deem more relevant.

frnco avatar Mar 10 '21 17:03 frnco

So, what's the current best way to join an array on newlines and disable newline splitting? This seems to do it, but I wanted to be sure:

set multiline_string (string join \n $arr | string collect)

cowboy avatar May 12 '21 17:05 cowboy

That's it, yes.

faho avatar May 12 '21 17:05 faho

This issue should be closed, as string split0 already does what the proposed string join \n does.

What needs to be done is to improve the documentation of string split and string split0, otherwise it's quite easy to get confused like @cben, in which case ad-hoc work-arounds are not satisfactory enough:

FWIW, I see string split0 being magical:

$ echo [(printf 'a\na\0b' | string split0)]
[a
a] [b]

but I don't see it in string split:

$ echo [(printf 'a\na,b' | string split ,)]
[a] [a] [b]

(not arguing for or against, just to be precise about current behavior)

IMO what needs to be documented is the typing rules of fish-shell:

  1. string split and string split0 are not string processing commands but type casting operations (string -> array) for fish-shell, at least it's the more logically consistent viewpoint
  2. command substitution ( ( ... ) ) logically has an implicit trailing string->array casting by string split \n, unless the final command is string split*
  3. when used alone or followed by a pipeline ( | ), the array produced by string split* is joined implicitly with \n into a string
  4. string split assumes the argument from stdin to be effectively followed by string split \n

From the typing viewpoint, it's easy to find a more principled solution for the problem: echo [(string split , (printf 'a\na,b' | string split0))], in which printf ... can be replaced by any command.

If possible, a better solution would be to remove (2), (3), and (4) altogether. Failing that, it's helpful to remember as a general set of rules that, when encountering problems, command substitution should always end with | string split*, and string split0 should only appear after |, and string split should not appear directly on either side of |.

champignoom avatar Sep 25 '21 17:09 champignoom

Why the off-topic mark? Is this feature already in the to-do list?

It would not be a decent feature if one has to investigate which one among string join \n, string join $newline, string join ,\n is lucky enough to get special care from the developers.

champignoom avatar Sep 26 '21 17:09 champignoom