bug icon indicating copy to clipboard operation
bug copied to clipboard

Stream concat (#:::) vs append (++)

Open manojo opened this issue 7 years ago • 9 comments

The difference between concat (#:::) and append (++) on streams should be documented more clearly. The append method does say that ++ works in very subtle ways, but it breaks down in even simple cases. Consider fibs defined as below.

def fibs: Stream[Int] = Stream(0, 1) ++ (fibs.zip(fibs.tail).map { case (a, b) => a + b })

Calling fibs will result in a stack overflow, despite (as documentation says), the left hand side of ++ being a Stream. Using concat (#:::) however works as expected.

def fibs: Stream[Int] = Stream(0, 1) #::: (fibs.zip(fibs.tail).map { case (a, b) => a + b })

Since this is a subtle issue (not clear whether a bug), it might just be wise to add a warning about this behaviour higher up in the documentation, and in the official docs.

manojo avatar Jun 15 '17 10:06 manojo

If the above makes sense, I'm happy to make the relevant PRs.

manojo avatar Jun 15 '17 10:06 manojo

Some of my students encountered that case, which confused the heck out of them. I had used #::: quite naturally in my own code and hadn't thought of trying with ++. It's actually a reasonable expectation that ++ would work lazily on a stream, the complexity of the Scala library notwithstanding.

charpov avatar Jun 27 '17 20:06 charpov

once scala/collection-strawman is merged into scala/scala, let's see whether the situation has changed in LazyList (which replaces Stream)

SethTisue avatar Mar 02 '18 22:03 SethTisue

@NthPortal where do we stand, here?

SethTisue avatar Feb 20 '19 22:02 SethTisue

As a brief note on why the two are not equivalent: #::: has a lazy suffix, while ++ does not. More specifically, they are roughly equivalent to having the following two signatures1:

// please pretend this method is not right-associative
def #:::[B >: A](suffix: => LazyList[B]): LazyList[B]
def ++[B >: A](suffix: IterableOnce[B]): LazyList[B]

As you can see, the argument to ++ must be evaluated before the method is called, whereas the argument to #::: can be evaluated only when you need to, well after the method is called.


Because the state of LazyList is lazy, the argument to ++ is effectively lazy too in this case (I think), so it should work fine.

... that being said, I will still test to confirm.

I have no particular knowledge of Stream, but I have no reason to believe it should behave any differently than in 2.12.


1 for reasons which are not important here, these are not their actual definitions

NthPortal avatar Feb 21 '19 01:02 NthPortal

I made a silly logic error in my above comment. fibs makes a call to ++ which has a strict argument, and the argument includes a call to fibs. This will loop and overflow the stack regardless of the implementation of ++, because the method is recursive. It is unavoidable.

NthPortal avatar Feb 21 '19 01:02 NthPortal

I don't think this is fixable (if we're considering it a bug), unless we decide to change the signature of concat/appendedAll/++/:++ to take by-name arguments, which doesn't seem super likely to happen.

NthPortal avatar Feb 21 '19 02:02 NthPortal

You'd need something like newLL to wrap => LazyList[A] into LazyList[A]. LazyList (unlike Stream) is lazy enough that it should be possible to write such a method. But it wouldn't really help because you'd have to call this method explicitly where needed, in which case you may as well use #::: instead of ++ to fix the problem.

I'm rescheduling this to 2.14 to revisit the API situation but I agree that it's unlikely that ++ will be changed` because of the performance implications.

szeiger avatar Aug 28 '19 14:08 szeiger

Any API tweak is out of scope here, but I think the doc should emphasize the special Deferrer magic of #:: and #:::.

Nth's comment was that the difference in signature is not relevant, but I think highlighting that there is a difference early on the doc page will let the reader know that these methods are special in constructing lazy lists.

The "see also" link to the overview should say, "for less information, see the overview", because it is just a brief paragraph to say, "here's how we construct lazy lists".

Conversely, the lazy list doc could be tweaked, as I've always found it a challenge to read. The language around recursion is obscure. But in sum, I don't see how all those words can omit mentioning toDeferrer.

som-snytt avatar Apr 06 '24 15:04 som-snytt