improvement-proposals SIP-70 - Flexible Varargs

Feb 28 '25 15:02 lihaoyi

Do you want to cite https://github.com/scala/improvement-proposals/pull/41 ? There are a lot of previous discussions on this feature.

This SIP is a renamed version of https://github.com/scala/improvement-proposals/pull/41 , and also adds an intermediate step to create a temporary seq.

applyBegin -> newBuilder
applyNext -> addOne
applyNextSeq -> addAll
applyEnd -> result

Mar 06 '25 21:03 Atry

I started to wander if this SIP could help resolve this issue: https://github.com/scala/scala3/issues/18009 In short, the problem there is that an invocation of a curried method added by a refinement on a subtype of Selectable, e.g.

mySelectable.foo(args1*)(args2*)

should get desugared to something like

mySelectable.applyDynamic("foo")(args1*, args2*).asInstanceOf[Foo]

which is now illegal, but would be when this SIP gets implemented

Mar 24 '25 17:03 prolativ

@odersky @sjrd @lrytz have any of you had a chance to look at this? It's been a month since reviewers were assigned so I'm hoping to get some feedback

Jun 29 '25 01:06 lihaoyi

I also think this SIP is a very nice improvement.

You mention / suggest to use IArray, which is an opaque type alias and not part of the collection hierarchy. sum(IArray.newBuilder[Int].addOne(1).result()*) wraps the resulting array using IArray.wrapIntArray, which converts it to an immutable.ArraySeq. So it's probably better to use ArraySeq directly. Or maybe Vector?

ArraySeq: can wrap a primitive array, minimal memory overhead
Vector: better if the collections is later used for appending / prepending, concatenation etc. Memory and computational overhead is small. Primitives are boxed.

I'm not sure if the newBuilder call needs an explicit type argument, or if we can have that inferred by the typer. In Scala 3:

scala> collection.immutable.ArraySeq.newBuilder.addOne(1).result()
-- [E172] Type Error: ----------------------------------------------------------
1 |collection.immutable.ArraySeq.newBuilder.addOne(1).result()
  |                                        ^
  |                                        No ClassTag available for Any
1 error found

Interestingly, the same line works on Scala 2, the compiler infers newBuilder[Int]. Is that a known limitation @odersky?

If an explicit type argument is needed, the SIP should specify how that type is obtained.

For patterns, we'll need to update the spec to say how case E(p1, ..., pN, s*, u1, ..., uM) translates. I think the section is https://scala-lang.org/files/archive/spec/3.4/08-pattern-matching.html#pattern-sequences, the spec seems not to be fully up to date though, it doesn't reflect https://github.com/scala/scala3/pull/11240.

This SIP doesn't change repeated parameters (of case classes), a repeated parameter will still be last in a case classs definition. An unapplySeq declaration should also continue to have the corresponding return type (T1, ..., Tm, Seq[S]).

So the spec needs to map the new pattern shape onto to current case class parameter list shape (or unapplySeq return type shape). Note that T1, ..., Tm can be arbitrary types, all different from each other and different from S. So I'm not sure the proposal of translating case E(p1, ..., pN, s*, u1, ..., uM) to case VarargsMatcher(Seq(p1, ..., pN), s, Seq(u1, ..., uM)) would work.

Jul 02 '25 09:07 lrytz

You mention / suggest to use IArray, which is an opaque type alias and not part of the collection hierarchy. sum(IArray.newBuilder[Int].addOne(1).result()*) wraps the resulting array using IArray.wrapIntArray, which converts it to an immutable.ArraySeq. So it's probably better to use ArraySeq directly. Or maybe Vector?

The motivation for IArray was to try and decouple the language feature from the concrete implementation of the Scala collections library: IArray seems like a relatively simple collection with relatively good performance characteristics. In particular, I wanted to:

Avoid using Seq which defaults to List which is from the start a lot less cache-friendly than IArray.
Vector has the downside of being a pretty specific collection in the standard library, with a very-non-trivial implementation, and AFAIK is something the Scala language was not aware of up to this point.

This SIP doesn't change repeated parameters (of case classes), a repeated parameter will still be last in a case classs definition.

Yes, definitions are unchanged. Not just for case classes, but for method defs as well, they all can only have one vararg* parameter at the end

An unapplySeq declaration should also continue to have the corresponding return type (T1, ..., Tm, Seq[S]).

Yes that is right. The trailing Seq[S] can be broken up into a smaller Seq in the middle and loose elements before/after it during destructuring, but these elements all continue to be homogeneously typed as S, and the signature of the unapplySeq does not need to change

Jul 07 '25 04:07 lihaoyi

Interestingly, the same line works on Scala 2, the compiler infers newBuilder[Int]. Is that a known limitation @odersky?

No, I did not know that.

Generally, I think it's better of the SIP does not specify a precise implementation scheme. The scheme might change in the future, and we should not have to change the SIP because of that. The IArray code can be considered for illustration, of course.

In terms of what we have, it looks like basing the whole thing on ArrayBuilder might work.

Jul 07 '25 08:07 odersky

IArray seems like a relatively simple collection with relatively good performance characteristics

It's ArraySeq that ends up being used – which seems a practical choice to me. Mentioning IArray is just a bit confusing, it requires the presence of an implicit conversion.

The spec says when using a sequence argument xs*, the repeated parameter is considered to be of type scala.Seq[S]. I agree using the default scala.Seq factory (List) would not be great. Maybe the SIP/spec can leave it as an implementation detail; f(x, xs*, y) is translated to Seq.newBuilder.addOne(x).addAll(xs).addOne(y).result(), but a compiler may use a different factory such that the result conforms to scala.Seq[S].

The trailing Seq[S] can be broken up into a smaller Seq in the middle and loose elements before/after it during destructuring, but these elements all continue to be homogeneously typed as S, and the signature of the unapplySeq does not need to change

Right. I was thinking what happens when using subpatterns and then got a bit lost in spec. The VarargsMatchHelper seems to work fine for that.

def f(x: Any): String = x match {
  case Seq(Some(_), y: String, xs*, 1) => y
}

// translates to

def f(x: Any): String = {
  val VarargsMatcher = new VarargsMatchHelper(2, 1)
  x match {
    case VarargsMatcher(Seq(Some(_), y: String), xs, Seq(1)) => y
  }
}

Jul 14 '25 16:07 lrytz

There is now an implementation of most of the proposal.

One doubt I am having is spreads of Option. We do not allow an option to be passed individually to a vararg parameter. I.e. the following would not work:

  def foo(x: Int*) = x.sum
  val xo = Some(2)
  f(xo*)

That's an error since the argument of a spread operator must be a Seq or an Array. Do we really want to change that? That would affect normal spreads not just spreads embedded in a longer list. There might be use cases for embedded options, but for a single spread passed to a vararg it feels a bit weird. Alternatively, we can demand an explicit toList, i.e.

  f(xo.toList*)

would work today. I'd be interested in getting the committee member's opinion on this aspect before I try to implement it.

Sep 02 '25 14:09 odersky

My motivation for including a special case for Option is that empirically it's super common to have "optional elements" in a sequence or sequence literal. Libraries like ScalaTags' Frag, OS-Lib's Shellable all allow this because it's such a common need. Other Iterables like Set or Map are not nearly as commonly spliced due to their nondeterministic ordering, a problem that Options don't have.

The use of a naked foo(opt*) is certainly a bit unusual, but it doesn't seem wrong. It doesn't seem worth it to make users jump through hoops in other scenarios by adding .toList everywhere just to prohibit the foo(opt*) scenario which is actually fine, and nobody really cares about either way anyway.

A third option is to allow splicing options in longer sequences such as foo(opt1*, opt2*), but prohibiting it the alone e.g. foo(opt*). Can be done, but again it seems like we're just adding irregularity just for the sake of it without any real benefit.

So IMO might as well allow foo(opt*) for the sake of (a) regularity and (b) convenience in other cases where it actually does matter

Sep 02 '25 15:09 lihaoyi

Perhaps a fourth alternative is that we add a new marker trait trait VarargUnpackable[T] that Option and Seq extend, and make foo(bar*) work on any VarargUnpackable. VarargUnpackable could have whatever abstract methods are necessary to efficiently participate in the flexible varargs desugaring. e.g. one possible definition is

trait VarargUnpackable[+T]{
  // Used when unpacking a single value, can  
  // efficiently return `this` for `Seq`s
  def toSeq: Seq[T]

  // Used when unpacking flexible varargs to 
  // efficiently feed the elements of this object
  // into a `Seq.builder`
  def copyToArray(dest: Array[T], startIndex: Int): Unit

  // Used when unpacking flexible varargs to allow 
  // the builder to expand its capacity precisely
  // when the size of this is known. Otherwise 
  // returns `-1` indicating the size it not known
  // and the builder would have to handle expansion
  // using its own heuristics
  def sizeHint: Int

  // When sizeHint is -1, we cannot pre-size the 
  // builder array, and so we append elements one
  // at a time to give the builder a chance to resize
  // as necessary
  def foreach(f: T => Unit): Unit
}

object VarargUnpackable{
  class Builder[T]{
    def addAll(v: VarargUnpackable[T])
    def add(v: T)
    def result(): Seq[T]
  }
}

// Desugaring
foo(bar*)
foo(VarargUnpackable.Builder().addAll(bar).build())

foo(bar*, qux*)
foo(VarargUnpackable.Builder().addAll(bar).addAll(qux).build())

foo(value1, bar*, value2, qux*, value3)
foo(VarargUnpackable.Builder().add(value1).addAll(bar).add(value2).addAll(qux).add(value3).build())

The VarargUnpackable trait above should be the minimal API necessary to efficiently build the Seq[T] we need to provide to the body of a vararg-taking method. This included allowing use of high-performance batch copy methods like System.arraycopy, for implementers of VarargUnpackable that can support them (e.g. ArraySeq, ArrayBuffer, ArrayDeque) while other collections can implement copyToArray in a more naive foreach/iterator-based fashion. The builder desugaring above is similar to the builder approach taken in Java's collection literals discussions. We could simplify VarargUnpackable's members to just def foreach if we didn't care about performance, but it may be worth the slight additional complexity to ensure we're doing things efficiently

That would avoid hardcoding Option and Seq, and have the upside that user-specified types will be able to hook into this syntax without having to extend Seq[T] themselves which we all know is very error prone and discouraged.

A fifth alternative would be to use a name-based desugaring so foo(bar*, qux*) calls methods on bar and qux without any requirement of them being a particular type. Similar to the trait-based approach above, but perhaps a bit more flexible.

Sep 03 '25 02:09 lihaoyi

Why not just let Option and Seq extend a newly introduced trait OrderedIterable[+A] extends Iterable[A] if the reason to exclude Set is ordering.

Sep 03 '25 02:09 Atry

@Atry I think it would be good to have the language feature depend on as minimal an interface as possible, so a minimal VarargUnpackable[+T] trait would be preferable since it doesn't extend Iterable and force implementers to support the rich API of the Iterable trait

Sep 03 '25 02:09 lihaoyi

I think all this goes to far. Varargs are Seq's. I see no reason to complicate things further here. Arrays can be passed to Seq's since there is an implicit conversion. The point of SIP 70 is to stay within this framework and allow multiple spreads. My PR provides that. Options unfortunately are an irregular case since there is no conversion from option to Seq (and for good reason!). I think it is much cleaner to keep that principle even if some use cases require an explicit .toList.

Here's another way to put it: I can have a vararg function and a spread argument to it:

def f(x: Int*)
f(xs*)

I can convert the vararg to a Seq and drop that the spread:

def f(x: Seq[Int])
f(xs)

That works, always. I have done that refactoring many times. But allowing options in spreads breaks it.

Generally varargs are tricky to get right since they need to be treated early during type inference when types are not yet known completely. That compounds complexity. So we need to keep it simple.

Sep 03 '25 07:09 odersky

I agree with @odersky. I don't question that there are use cases for spreading Options (I can see a few spots in my code where I would definitely use it). The irregularity is a bit too much in this case, though. Having to add a .toList in order to be able to spread it seems like an acceptable tradeoff in order to retain the relative simplicity of varargs.

Sep 03 '25 07:09 sjrd

I think it would indeed be interesting if we had a minimal interface for varargs and spreads instead of the Seq we have now. That would go in the general direction to minimize the language mandated footprint of the standard library. But it would raise a lot of issues, including how to organize migration and how to keep backwards compatibility. So I think this would have to be a different SIP.

Sep 03 '25 08:09 odersky

As a long-term Scala user, I strongly support @lihaoyi's arguments. Optiontype is so essential to everyday Scala programmer life ergonomics that I would love to see it accepted friction-free whenever possible, not less.

Sep 03 '25 08:09 arturopala

Maybe we can add .cc as a method that turns a non-collection type into a collection. name.cc* doesn't seem too bad. Even x.cc* doesn't seem too bad to me. x.toList* is a lot more effort.

If we allowed a spread type then we could use a unary * suffix operator to simply allow x.* while the ordinary collection would be x*. As it is, the vararg type isn't first-class, though, so you can't use it this way.

Sep 04 '25 04:09 Ichoran

https://github.com/scala/scala3/pull/23855 now also implements the pattern matching specified in the SIP. This means the implementation is complete, except for the contentious point whether we want to support Option. (I still think we should not).

Sep 18 '25 21:09 odersky

We can probably merge the Seq-only implementation first and follow up with the Option handling later: follow up discussion in the SIP meeting, follow up PR, or follow up SIP entirely. No need to block the meat of this proposal on an edge case when the bulk of it is uncontroversial

Sep 19 '25 02:09 lihaoyi

Let's discuss this point next week in the SIP meeting.

Sep 19 '25 06:09 odersky

@Ichoran wrote:

Maybe we can add .cc as a method that turns a non-collection type into a collection. name.cc* doesn't seem too bad. Even x.cc* doesn't seem too bad to me. x.toList* is a lot more effort.

We already have Some(42).toSeq, which is not that bad, I think...

Sep 19 '25 10:09 bjornregnell

On the September 26 meeting we agreed to remove Option spreads from this SIP proposal, that is still pending.

The implementation (https://github.com/scala/scala3/pull/23855, available as experimental in 3.8) does not have Option spreads.

Nov 28 '25 13:11 lrytz

@lrytz I updated the doc to move the Option handling under ## Limitations since we no longer support it

Nov 28 '25 16:11 lihaoyi

Please note that the SIP template has changed:

The SIP number goes into a number: NN YAML header, and titles should not be prefixed with SIP NN -
The file name should be prefixed with the SIP number, padded to three digits.

Thanks!

Nov 28 '25 17:11 bracevac