language icon indicating copy to clipboard operation
language copied to clipboard

Iterable pattern

Open rrousselGit opened this issue 4 months ago • 10 comments

TL;DR: Dart lacks some way to do pattern matching on Iterable directly

When we use Iterables without transforming said Iterable into a List/Set/... (such as to avoid allocating a large collection), we end-up lacking a mean to use pattern matching.
Could we maybe have some Iterable-specific way to express a pattern match?

We could imagine #(a, b, c) as syntax for such pattern:

final list = [1, 2, 3];
final iterable = list.where((e) => e.isOdd);

switch (iterable) {
  case #(1, 3):
}

Or course, the token used could be anything else. Like #[], or not even use #.
Relying on () makes sense to me because Iterable.toString() uses (). But of course, this (1, 2) is taken by Record patterns.

Bonus:

  • Support #(1, 2, ...)
  • Support #(1, 2, ...final other) where other is an Iterable

rrousselGit avatar Aug 22 '25 01:08 rrousselGit

You shouldn't iterate an Iterable more than once.

It's possible that a pattern match can satisfy that by keeping an iterator and all previously seen values in its internal state, the same way out does for lists. It will effectively create a list of the elements as they are tested. But list patterns also cache all values they test, so that's not new.

You can almost get the same effect by having an unmodifiable LazyList class that builds itself from an iterator when it's accessed. If you never access above index n, it will only iterate to that. (Reading length will immediately read the entire iterable and build the complete list.) Then you could have an extension getters .eagerList and .lazyList to convert an iterable to a to list, and you can use them in patterns as Iterable(lazyList: list-pattern)

Except that list patterns do .length>n all the time to avoid access errors. If lists had a hasIndex(n), then it could see what the length of used for, instead of having to give the full length to someone who only needs to know if it's greater than 2, just like isNotEmpty is used instead of .length > 0. Too bad.

lrhn avatar Aug 22 '25 06:08 lrhn

Pattern matching makes the most sense when you're matching on is an immutable blob of data. In fact, Dart goes way out of its way to manufacture those semantics even when the underlying object is actually mutable in order for exhaustiveness checking to work.

Given that, I don't think matching directly on an Iterable makes much sense. The value of an Iterable over a List is that the computation is deferred, but with pattern matching, you kind of want the computation to be done anyway. I suppose it could be useful to match on an Iterable if the patterns are always careful to use something like ... so that you never need to access the length or walk the entire sequence. It would basically be a way to match against the head of a lazy, potentially infinite collection.

But that feels like a fairly marginal use case to me. If that's what you need, then probably the easiest way to express that is:

final list = [1, 2, 3];
final iterable = list.where((e) => e.isOdd);

switch (iterable.take(2).toList()) {
  case [1, 3]:
}

In other words, eagerly grab the prefix you care about to a list, and then go from there.

munificent avatar Aug 29 '25 00:08 munificent

I suppose it could be useful to match on an Iterable if the patterns are always careful to use something like ... so that you never need to access the length or walk the entire sequence. It would basically be a way to match against the head of a lazy, potentially infinite collection.

In haskell the built in list data structure is a lazily evaluated potentially infinite sequence, and you can do pattern matching on it.

I once translated the haskell program from this paper into dart. The original program makes extensive use of pattern matching on lazy lists.

My first attempt was using dart lists. The program basically generates a list of all possible solutions to a problem, and then prints out only the first solution. In haskell printing out the first entry in the solution list only computes the first solution because the lists are lazy, but since in dart lists are not lazy all of the solutions get generated.

My second attempt was a rewrite using dart iterables. This version successfully uses lazy evaluation to avoid generating the rest of the solutions list. However I had to use some tricks to make pattern matching on iterables work in dart.

Let's look at one of the functions in the program called exprs.

-- original haskell implementation
exprs     :: [Int] → [Expr]
exprs []  =  []
exprs [n] =  [Val n]
exprs ns  =  [e | (ls, rs) ← nesplit ns, 
                  l ← exprs ls, 
                  r ← exprs rs, 
                  e ← combine l r]
// dart using list
List<Expr> exprs(List<int> list) => switch (list) {
      [] => [],
      [var n] => [Val(n)],
      var ns => [
          for (var (ls, rs) in nesplit(ns))
            for (var l in exprs(ls))
              for (var r in exprs(rs))
                for (var e in combine(l, r)) e
        ],
    };
// dart using iterable
Iterable<Expr> exprs(Iterable<int> iterable) => switch (iterable) {
      Iterable(isEmpty: true) => empty,
      Iterable(first: var n, rest: Iterable(isEmpty: true)) => single(Val(n)),
      var ns => () sync* {
          for (var (ls, rs) in nesplit(ns)) {
            for (var l in exprs(ls)) {
              for (var r in exprs(rs)) {
                for (var e in combine(l, r)) {
                  yield e;
                }
              }
            }
          }
        }(),
    };

// requires an extension method for rest

extension<A> on Iterable<A> {
  Iterable<A> get rest => skip(1);
}
const Iterable<Never> empty = [];
Iterable<A> single<A>(A elem) => [elem];

With the proposed iterable pattern we could simplify to

// with iterable pattern
Iterable<Expr> exprs(Iterable<int> iterable) => switch (iterable) {
      #() => empty,
      #(var n) => single(Val(n)),
      var ns => () sync* {
          for (var (ls, rs) in nesplit(ns)) {
            for (var l in exprs(ls)) {
              for (var r in exprs(rs)) {
                for (var e in combine(l, r)) {
                  yield e;
                }
              }
            }
          }
        }(),
    };

Which makes the patterns simpler, but the code could be improved further if we also had iterable literals preferably with a syntax that mirrors the iterable patterns.

// with iterable pattern and iterable literal
Iterable<Expr> exprs(Iterable<int> iterable) => switch (iterable) {
      #() => #(),
      #(var n) => #(Val(n)),
      var ns => #(
          for (var (ls, rs) in nesplit(ns)) 
            for (var l in exprs(ls)) 
              for (var r in exprs(rs)) 
                for (var e in combine(l, r)) e
        ),
    };

The example in the original post can technically be written as the following in current dart, using the same techniques as I used in my program I discuss above.

void main() {
  final list = [1, 2, 3];
  final iterable = list.where((e) => e.isOdd);

  switch (iterable) {
    case Iterable(first: 1, rest: Iterable(first: 3, rest: Iterable(isEmpty: true))):
      // ...
  }
}

extension<A> on Iterable<A> {
  Iterable<A> get rest => skip(1);
}

But I don't think it's a good solution.


But that feels like a fairly marginal use case to me. If that's what you need, then probably the easiest way to express that is:

final list = [1, 2, 3];
final iterable = list.where((e) => e.isOdd);

switch (iterable.take(2).toList()) {
  case [1, 3]:
}

In other words, eagerly grab the prefix you care about to a list, and then go from there.

This is not the same pattern because it will still match if there are more than 2 elements. In other words its equivalent to case #(1, 3, ...): not case #(1, 3):.

mmcdon20 avatar Aug 31 '25 06:08 mmcdon20

Dart iterables are not a good match for lazy linked lists. They have a length. A better match would be Iterator, but it's not a type you often use directly.

You could create a type for linked lists, and a lazy version, and do pattern matching on the Cons object. Sadly can't use records, they don't work well with recursive typing.

lrhn avatar Aug 31 '25 08:08 lrhn

Dart iterables are not a good match for lazy linked lists. They have a length.

I’m not convinced that the presence of a .length getter disqualifies Iterable from being a match target. Haskell lists also expose length, and calling it on an infinite list will blow up—but pattern matching doesn’t rely on that. It’s entirely possible to match lazily without ever invoking .length.

switch (v) {
  // list patterns
  case []:
  case [var x]:
  case [var x, ...]:
  case [var x, ...var xs]:
  case [var a, var b]:
      
  // iterable patterns
  case Iterable(isEmpty: true):
  case Iterable(first: var x, rest: Iterable(isEmpty: true)):
  case Iterable(first: var x):
  case Iterable(first: var x, rest: var xs):
  case Iterable(first: var a, rest: Iterable(first: var b, rest: Iterable(isEmpty: true))):
}  

extension<A> on Iterable<A> {
  Iterable<A> get rest => skip(1);
}

The key restriction for Iterable patterns would be that ... can only appear at the end—i.e., case #(var x, ...var xs): is valid, but case #(...var xs, var x): would not be. This avoids the need for full evaluation or backtracking.

mmcdon20 avatar Sep 02 '25 01:09 mmcdon20

That looks like it would work, but it's unlikely to be efficient.

And I'm not sure skip(1) is the implementation you want. It doesn't distinguish between moving past the end, and doing a lookup on the first value requires iteratinig past all prior positions to get to it.

If you do:

  switch (iterable) {
    case Iterable(rest: Iterable(rest: Iterable(rest: Iterable(rest: var skip4)))):  /// ...?...
    case (Iterable(first: 42)): // Great!
  }

then you create iterable.skip(1), iterable.skip(1).skip(1), iterable.skip(1).skip(1).skip(1) and iterable.skip(1).skip(1).skip(1).skip(1). (At least the runtime is likely to optimize that to iterable.skip(1), iterable.skip(2), iterable.skip(3) and iterable.skip(4), because it recognizes when you call skip on the result of calling skip.)

The pattern above looks like it checks that there are at least four elements. It doesn't. If there are only two, the pattern works, and skip4 is just empty. So you have to add an isNotEmpty: true before you do first: ..., otherwise it might throw at runtime.

  switch (iterable) {
    case Iterable(rest: Iterable(rest: Iterable(rest: Iterable(isNotEmpty: true, first: == 4)))): /// ...
    case Iterable(rest: Iterable(rest: Iterable(isNotEmpty: true, first: == 3))):  /// ....
    case Iterable(rest: Iterable(isNotEmpty: true, first: == 2)):  /// ....
    case Iterable(isNotEmpty: true, first: == 1):  /// ....
  }

This will create four iterables of length 4, 3, 2 or 1, then iterate each twice, once for isNotEmpty and once to get the value.

It's far from the efficiency of iterating a:

class Cons<T> {
   T car;
   Cons<T>? cdr;
   Cons(this.car, [this.cdr]);
}

using something like

Cons<T> swapFirst<T>(Cons<T> list) =>
   switch (list) {
      case Cons(car: var car1, cdr: Cons(car: car2, :var cdr)) => Cons(car2, Cons(car1, cdr)),
      case c => c,
   };

The lazy evaluation of an Iterable makes it ill suited to pattern matching, because there is no iteration functionality on the iterable itself, only lazy skipping. All you can do is create more wrappers until you ask for an actual iterator. The Iterator is the thing that can be actively iterated.

The only way I can see an iterable pattern working is by acting on a single iterator. That could work, but I'm not sure it's as valuable in Dart as it would be in other, more functional, languages.

lrhn avatar Sep 02 '25 17:09 lrhn

There's always going to be an impedance mismatch when porting code from one paradigm to another. While I do think it's important for Dart to have good support for programming in a "functional style" (for some suitably hand-wavey definition of that), going all the way to a full Haskell "laziness + cons lists for sequences" is still a big jump. I'm not surprised that Dart doesn't support that style very gracefully. It really wasn't designed to.

It is the case that Iterables are lazy in Dart, but I personally think that's mostly an API abstraction that leaks out. In most cases, operations on lists that return lazy iterables eventually get "forced" to something eager by a trailing .toList() or other similar operation. The main reason that where(), map(), etc. are lazy is that you can chain them without allocating a large list for every operation and instead only do that once at the end. It's sort of a poor man's [loop fusion](https://haskell.foundation/hs-opt-handbook.github.io/src/Preliminaries/what_makes_fast_hs.html#canonical-fusion.

We could argue that Dart should support this kind of lazy-Haskell-ish style well, but it's hard. I think any tool ultimately has to make choices about what affordances are easiest for users at the expense of others. There's only so much grammar to go around and allocating some for one paradigm comes at the expense of others and adds to the total load of the language.

I've also ported some very elegant lazy Haskell code to Dart and found it challenging to do so correctly. But ultimately, I found it most useful to try to port it in a way where the resulting code was idiomatic Dart, even if it ended up more verbose. (On the other hand, I found it a hell of a lot easier to reason about the performance once I had. Lazy languages are not exactly illuminating when it comes to runtime speed.)

munificent avatar Sep 04 '25 00:09 munificent

I ended up writing a third version of my program for comparison. This time using a custom lazy linked list data structure, where the lazy links are backed by an Iterator.

//////////////////////////////////
/// Custom Lazy LinkedList Impl //
//////////////////////////////////

sealed class LinkedList<A> extends Iterable<A> {
  factory LinkedList.of(List<A> elements) =>
      elements.reversed.fold<LinkedList<A>>(const Nil(), (a, b) => Cons(b, a));
  factory LinkedList.empty() => const Nil();
  factory LinkedList.generate(Iterable<A> Function() generator) =>
      LinkedList(generator().iterator);
  factory LinkedList(Iterator<A> iterator) =>
      iterator.moveNext() ? _LazyCons(iterator) : LinkedList.empty();

  const LinkedList._();

  Iterable<A> get iterable sync* {
    LinkedList<A> current = this;
    while (true) {
      switch (current) {
        case Nil():
          return;
        case Cons<A> c:
          yield c.head;
          current = c.tail;
      }
    }
  }

  @override
  Iterator<A> get iterator => iterable.iterator;

  @override
  bool operator ==(Object other) => switch (this) {
        Cons(:var head, :var tail) =>
          other is Cons<A> && other.head == head && other.tail == tail,
        Nil() => other is Nil,
      };

  @override
  int get hashCode => Object.hashAll(this);

  @override
  String toString() => switch (this) {
        _LazyCons(:var head, _tail: null) => '$head, ...',
        Cons(:var head, :var tail) => '$head, $tail',
        Nil() => 'Nil',
      };
}

sealed class Cons<A> extends LinkedList<A> {
  const Cons._() : super._();
  const factory Cons(A head, LinkedList<A> tail) = _StrictCons;

  A get head;
  LinkedList<A> get tail;
}

class _StrictCons<A> extends Cons<A> {
  const _StrictCons(this.head, this.tail) : super._();

  @override
  final A head;

  @override
  final LinkedList<A> tail;
}

class _LazyCons<A> extends Cons<A> {
  _LazyCons(this._iterator)
      : head = _iterator.current,
        super._();

  final Iterator<A> _iterator;
  LinkedList<A>? _tail;

  @override
  final A head;

  @override
  LinkedList<A> get tail {
    if (_tail != null) {
      return _tail!;
    }
    if (_iterator.moveNext()) {
      return _tail = _LazyCons(_iterator);
    }
    return _tail = const Nil();
  }
}

class Nil extends LinkedList<Never> {
  const Nil() : super._();
}

// example usage
void main() {
  // infinite list of natural numbers
  LinkedList<int> naturalNumbers = LinkedList.generate(() sync* {
    for (int i = 1; ; i++) {
      yield i;
    }
  });

  switch (naturalNumbers) {
    case Nil():
      print('empty');
    case Cons(head: var a, tail: Nil()):
      print('single: $a');
    case Cons(head: var a, tail: Cons(head: var b, tail: Nil())):
      print('double: $a, $b');
    case Cons(head: var a, tail: Cons(head: var b, tail: Cons(head: var c))):
      print('multi: $a, $b, $c, ...');
  }
}
multi: 1, 2, 3, ...

The solution is less hacky and more of a proper implementation if you will, but performance at least in this case appears to be worse than the hacky extension on Iterable approach.

first attempt (list) second attempt (iterable) third attempt (lazy linked list)

mmcdon20 avatar Sep 11 '25 01:09 mmcdon20

The reason the hacky version seems efficient is probably that it's actually backed by a list. If you know that, you can just use list patterns. Try a larger example and an iterable that is not a list (even if it's just theList.where((_) => true)).

Consider a more thorough benchmark like this:

class Cons<T> {
  final T value;
  final Cons<T>? next;
  Cons(this.value, [this.next]);
  String toString() =>
      '[${[for (Cons? c = this; c != null; c = c.next) c.value].join(',')}]';
}

extension<T> on Iterable<T> {
  Iterable<T> get rest => skip(1);
}

int sumIterable(Iterable<int> values) {
  var current = values;
  var result = 0;
  while (current.isNotEmpty) {
    result += current.first;
    current = current.rest;
  }
  return result;
}

int sumCons(Cons<int>? values) {
  var current = values;
  var result = 0;
  while (current != null) {
    result += current.value;
    current = current.next;
  }
  return result;
}

void main() {
  Cons<int>? kiloCons = null;
  for (var i = 1000; i > 0; i--) {
    kiloCons = Cons(i, kiloCons);
  }
  kiloCons!;
  int runCons() => sumCons(kiloCons);
  var kiloList = [for (var i = 1; i <= 1000; i++) i];
  int runList() => sumIterable(kiloList);
  var kiloIterable = kiloList.where((_) => true);
  int runIterable() => sumIterable(kiloIterable);
  // Warmup.
  bench("Cons", runCons, 500500, 100);
  bench("Iterable(List)", runList, 500500, 100);
  bench("Iterable", runIterable, 500500, 100);
  // Benchmark.
  bench("Cons", runCons, 500500, 1500);
  bench("Iterable(List)", runList, 500500, 1500);
  bench("Iterable", runIterable, 500500, 1500);
}

void bench(
  String name,
  Object? Function() action,
  Object? expectedResult,
  int limit,
) {
  var c = 0, e = 0, n = 100, r = null, sw = Stopwatch()..start();
  do {
    for (var i = 0; i < n; i++) {
      r = action();
    }
    e = sw.elapsedMilliseconds;
    c += n;
    n *= 2;
  } while (e < limit);
  if (expectedResult != r) throw "Bad result: $name: $r != $expectedResult";
  if (limit > 200) { // Not warm-up.
    print("$name: ${(c / e).toStringAsFixed(2)} /ms");
  }
}

This uses the same get rest => skip(1) to treat an iterable as an iterator, both on a list and on a non-list iterable.

On my JIT VM, the Cons computation is 15 times faster than the Iterable-which-is-List , which is 250 tims faster than the pure Iterable version:

Cons: 753.24 /ms
Iterable(List): 53.67 /ms
Iterable: 0.19 /ms

The Cons operation is almost 4000 times faster than the iterable.

When AOT compiled, Cons is even faster and the other ones are even slower..

lrhn avatar Sep 11 '25 11:09 lrhn

@lrhn

The reason the hacky version seems efficient is probably that it's actually backed by a list. If you know that, you can just use list patterns. Try a larger example and an iterable that is not a list (even if it's just theList.where((_) => true)).

For my use case I needed a data structure that was lazy. The benefit of the Iterable pattern is that you can match against the front of the collection without evaluating the entire collection. You can even match against an infinite collection. If you use a list pattern, or your version of a linked list implementation you don't get that.

That said, you are correct that the Iterable extensions approach performs terribly in a lot of situations like summing a large collection.

A better comparison would be this. Here I create an infinite iterable of numbers 0, 1, 2, ..., and an infinite lazy linked list of the same numbers, and then benched summing the first 1000 entries. The lazy linked list backed by an Iterator far outperforms the Iterable in this test.

mmcdon20 avatar Sep 11 '25 14:09 mmcdon20