elvish
Separate byte pipe and value pipe
Right now, each pipe contains both a byte pipe component and a value pipe component. In echo foo | cat, the byte pipe is used; in put bar | each put, the value pipe is used. In { echo foo; put bar } | { cat; each put }, the byte pipe carries the bytes "foo\n", while the value pipe carries the string value "bar".
We might want to separate these two pipes by e.g. using | for byte pipes exclusively and ! for value pipes. Then we will write echo foo | cat and put bar ! each put. In { echo foo; put bar } | { cat; each put }, the put and each do nothing. In { echo foo; put bar } ! { cat; each put }, the echo and cat do nothing.
Instead of doing nothing when the pipe type is incompatible with the command, we might want to do some sort of implicit conversion.
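To make the proposed semantics concrete, here is a toy model in Python. It is not how Elvish implements pipes; the Pipe class, its fields, and the run helper are hypothetical names invented for illustration. It only shows which writes survive under the current combined pipe, a byte-only |, and a value-only !.

```python
from dataclasses import dataclass, field


@dataclass
class Pipe:
    """Toy model of a pipe with a byte component and a value component."""
    data: bytearray = field(default_factory=bytearray)
    values: list = field(default_factory=list)
    carry_bytes: bool = True   # False models the proposed value-only `!`
    carry_values: bool = True  # False models the proposed byte-only `|`

    def echo(self, text):
        # Byte-oriented writer, like `echo`; silently dropped on a value-only pipe.
        if self.carry_bytes:
            self.data += (text + "\n").encode()

    def put(self, value):
        # Value-oriented writer, like `put`; silently dropped on a byte-only pipe.
        if self.carry_values:
            self.values.append(value)


def run(pipe):
    """Model `{ echo foo; put bar }` writing into the given pipe."""
    pipe.echo("foo")
    pipe.put("bar")
    return bytes(pipe.data), list(pipe.values)


current = run(Pipe())                       # today's combined pipe
byte_only = run(Pipe(carry_values=False))   # proposed `|`
value_only = run(Pipe(carry_bytes=False))   # proposed `!`
```

In this model the "do nothing" behaviour is the silent drop inside echo and put; an implicit conversion would replace that drop with a translation between the two components.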
In light of the changes made in the past 3+ years, and the experience gained by people using elvish, I'm a thumbs-down on this proposal for a few reasons. First and foremost is the ambiguity it introduces. What happens when the output of one stage of a pipeline contains bytes and values? Do we really want, or need, to support something like this:
{ echo foo; echo bar; put foo; put bar } | { cat } ! { each put }
Why should someone have to write that rather than just | { cat; each put }? But, more importantly, how would such a syntax work when there is more than one data producer?
It seems to me the more fundamental question is whether a pipe should support mixing bytes and values. At the time I write this it appears that, in practice, mixing the two types of streams in a pipe is rare and doesn't cause many problems when it occurs. Regardless, it might be worth considering having the elvish runtime disallow mixing bytes and values in a pipeline unless the user explicitly says that is okay. Obviously this would require a panic when mixed types in a pipe are detected and the user did not use the syntax to allow it, which is itself problematic. It could also result in some users always using the syntax that allows mixing bytes and values in a pipe, thus defeating the whole point of the safety check. Nonetheless, I think requiring the user to specify that a pipe supports both bytes and values is a better way to introduce some run-time safety to guard against an "impedance mismatch" than introducing syntax for bytes only or values only pipes.
If commands leave in the pipe the values they don't use (see #923 for an example), then this issue could be solved using the existing only-bytes and only-values builtins, e.g.:
{ echo foo; echo bar; put foo; put bar } | { only-bytes | each $echo~ ; only-values | each $put~ }
(this doesn't work at the moment, since only the first command captures the pipeline)
@krader1961 I agree that if anything should be implemented in this direction, it should be forbidding the mixture of value and byte pipes. I agree with the observation that in practice it's very rare for a command to output both bytes and values.
Instead of raising an exception at runtime (I assume that's what you mean by "panic"), Elvish could forbid the mixture of value and byte pipes statically, by requiring each function to either only write to the value pipe, or only write to the bytes pipe. Essentially a static type system for functions.
The restriction being static, it will disallow some functions that behave well at runtime, such as this:
fn f [b]{
  if $b {
    put value
  } else {
    echo bytes
  }
}
Function f never mixes value output and bytes output, but since the restriction needs to be static, it will now be rejected.
I don't think such functions are common in practice, but I may be wrong.
In any case, it seems worth trying to build this static typing system and seeing how well it works.
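A rough idea of what such a static check could look like, modelled in Python rather than in Elvish's own compiler. The tuple-based statement encoding and the names output_kinds and statically_ok are assumptions made up for this sketch. The check collects every output kind a body might produce on any branch and rejects the body if it finds more than one.

```python
def output_kinds(stmt):
    """Return the set of output kinds a statement may produce on any path."""
    op = stmt[0]
    if op == "put":
        return {"values"}
    if op == "echo":
        return {"bytes"}
    if op == "if":
        # Both branches count, because the condition is unknown statically.
        _, then_branch, else_branch = stmt
        return output_kinds(then_branch) | output_kinds(else_branch)
    if op == "seq":
        kinds = set()
        for inner in stmt[1]:
            kinds |= output_kinds(inner)
        return kinds
    raise ValueError(f"unknown statement: {op!r}")


def statically_ok(body):
    """Accept a body only if it can write to at most one kind of pipe."""
    return len(output_kinds(body)) <= 1


# Hypothetical encoding of the body of f: if $b { put value } else { echo bytes }
f_body = ("if", ("put", "value"), ("echo", "bytes"))
# A body that only ever writes values.
values_only_body = ("seq", [("put", "a"), ("put", "b")])
```

Here statically_ok(f_body) is False even though any single call of f writes only one kind of output, which is exactly the conservatism described above.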
The problem, as I see it, with trying to do this via a static check is how to handle external commands. Consider an elvish function that runs an external command solely for its side effects. It is not expected that the external command will produce any output. The elvish function only writes to the value stream. Does that combination trigger a static check exception? If yes, it implies that a function that writes to the value stream must capture, or discard, the output of any external command, even if that command is not expected to produce any output. There are also no doubt other scenarios which introduce ambiguity. It's less ambitious, but probably more practical, to simply monitor the data flowing through each pipe and raise an exception (or otherwise report a problem) if both values and bytes are pushed into the pipe.
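The runtime monitoring idea could be sketched roughly as follows. This is a Python model, not Elvish runtime code; CheckedPipe, MixedPipeError, the allow_mixed flag, and try_mixed are all hypothetical names. The pipe remembers the first kind of data written to it and raises if the other kind shows up, unless mixing was explicitly allowed.

```python
class MixedPipeError(Exception):
    """Raised when bytes and values are mixed in a pipe without opting in."""


class CheckedPipe:
    def __init__(self, allow_mixed=False):
        self.allow_mixed = allow_mixed  # models an explicit "mixing is okay" opt-in
        self.first_kind = None          # "bytes" or "values" once something is written
        self.items = []

    def _record(self, kind, payload):
        if self.first_kind is None:
            self.first_kind = kind
        elif kind != self.first_kind and not self.allow_mixed:
            raise MixedPipeError(
                f"pipe already carries {self.first_kind}, got {kind}"
            )
        self.items.append((kind, payload))

    def write_bytes(self, data):
        self._record("bytes", data)

    def write_value(self, value):
        self._record("values", value)


def try_mixed(pipe):
    """Write bytes, then a value; report whether the pipe accepted both."""
    pipe.write_bytes(b"foo\n")
    try:
        pipe.write_value("bar")
    except MixedPipeError:
        return False
    return True
```

Because only actual writes are checked, an external command that is run for its side effects and prints nothing never trips the check in this model, which sidesteps the ambiguity the static approach runs into.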
My very limited experience with Elvish suggests that I'll probably wrap quite a few non-Elvish binaries, and with that it's almost guaranteed I'll mix both pipe kinds in one command. I think there are also other legitimate use cases.
Another option might be to:
- make all Elvish commands produce only through a "value pipe"
- make all Elvish commands consume only through a "value pipe"
- make the ordinary pipe | always auto-convert "byte pipe" data to some low-level "value pipe" data (low-level to avoid as much overhead and copying as possible) if a non-Elvish binary is on the right side of the pipe
- introduce a simple command with broad conversion variability for "pretty printing", which would be the only exception, producing only through a "byte pipe"
(btw. this scheme also supports piping to another instance of Elvish, which might come in handy in multiprocessing scenarios)
My use case involves running commands whose output I want to show to the user, but these commands run inside functions that produce values. The only way at the moment is to output to stderr. For example, with var v = (some-function), I just can't get the value and let the byte output through unless I use stderr. I could print directly to the stdout or stderr of the process, or even to $os:dev-tty, but then the caller of a function will lose control of that output. Maybe there should be additional ports for logging?
EDIT: I found a solution by using redirections to custom ports: 1>&3 2>&4, and outputting it back later with 3>&1 4>&2. This is not very well explained, and the example seems broken, in https://elv.sh/ref/language.html#redirection.