bug
bug copied to clipboard
Stream concat (#:::) vs append (++)
The difference between concat (#:::
) and append (++
) on streams should be documented more clearly. The append method does say that ++
works in very subtle ways, but it breaks down in even simple cases. Consider fibs defined as below.
def fibs: Stream[Int] = Stream(0, 1) ++ (fibs.zip(fibs.tail).map { case (a, b) => a + b })
Calling fibs
will result in a stack overflow, despite (as documentation says), the left hand side of ++
being a Stream
. Using concat (#:::
) however works as expected.
def fibs: Stream[Int] = Stream(0, 1) #::: (fibs.zip(fibs.tail).map { case (a, b) => a + b })
Since this is a subtle issue (not clear whether a bug), it might just be wise to add a warning about this behaviour higher up in the documentation, and in the official docs.
If the above makes sense, I'm happy to make the relevant PRs.
Some of my students encountered that case, which confused the heck out of them. I had used #:::
quite naturally in my own code and hadn't thought of trying with ++
. It's actually a reasonable expectation that ++
would work lazily on a stream, the complexity of the Scala library notwithstanding.
once scala/collection-strawman is merged into scala/scala, let's see whether the situation has changed in LazyList
(which replaces Stream
)
@NthPortal where do we stand, here?
As a brief note on why the two are not equivalent: #:::
has a lazy suffix, while ++
does not. More specifically, they are roughly equivalent to having the following two signatures1:
// please pretend this method is not right-associative
def #:::[B >: A](suffix: => LazyList[B]): LazyList[B]
def ++[B >: A](suffix: IterableOnce[B]): LazyList[B]
As you can see, the argument to ++
must be evaluated before the method is called, whereas the argument to #:::
can be evaluated only when you need to, well after the method is called.
Because the state of LazyList
is lazy, the argument to ++
is effectively lazy too in this case (I think), so it should work fine.
... that being said, I will still test to confirm.
I have no particular knowledge of Stream
, but I have no reason to believe it should behave any differently than in 2.12.
1 for reasons which are not important here, these are not their actual definitions
I made a silly logic error in my above comment. fibs
makes a call to ++
which has a strict argument, and the argument includes a call to fibs
. This will loop and overflow the stack regardless of the implementation of ++
, because the method is recursive. It is unavoidable.
I don't think this is fixable (if we're considering it a bug), unless we decide to change the signature of concat
/appendedAll
/++
/:++
to take by-name arguments, which doesn't seem super likely to happen.
You'd need something like newLL
to wrap => LazyList[A]
into LazyList[A]
. LazyList (unlike Stream) is lazy enough that it should be possible to write such a method. But it wouldn't really help because you'd have to call this method explicitly where needed, in which case you may as well use #:::
instead of ++
to fix the problem.
I'm rescheduling this to 2.14 to revisit the API situation but I agree that it's unlikely that ++
will be changed` because of the performance implications.
Any API tweak is out of scope here, but I think the doc should emphasize the special Deferrer
magic of #::
and #:::
.
Nth's comment was that the difference in signature is not relevant, but I think highlighting that there is a difference early on the doc page will let the reader know that these methods are special in constructing lazy lists.
The "see also" link to the overview should say, "for less information, see the overview", because it is just a brief paragraph to say, "here's how we construct lazy lists".
Conversely, the lazy list doc could be tweaked, as I've always found it a challenge to read. The language around recursion is obscure. But in sum, I don't see how all those words can omit mentioning toDeferrer
.