language icon indicating copy to clipboard operation
language copied to clipboard

Switch statements on ints should accept ranges

Open Hixie opened this issue 3 years ago • 9 comments

(This is not a request from the Flutter team and should not be accorded higher priority because of my role.)

One thing I miss from Pascal is the ability to specify ranges when doing a switch over an int. For example:

switch (opcode) { // opcode is a byte
  case 0..127:
    // ...
    break;
  case 128:
    // ...
    break;
  case 129..132:
    // ...
    break;
  case 133..156:
    // ...
    break;
  default:
    // ...
}

Hixie avatar Jun 01 '21 00:06 Hixie

also:

switch (someToken.value) {
  case 'a'..'z':
    break;
  default:
}

ykmnkmi avatar Jun 01 '21 03:06 ykmnkmi

This looks like pattern-matching in some sense (but switch always does, it's just currently very trivial patterns).

Adding such a feature just for integers seem unnecessarily restrictive. If we instead allowed it for any class which implements Comparable<X> for some X, and the range ends must then implement X.

So:

switch (e) { // e has static type T which implements Comparable<X>, and value v
  case r1..r2:  // r1 and r2 are expressions of type X
    ... matches if  r1.compareTo(v) <= 0 and r2.compareTo(v) >= 0.

We can the discuss whether r1 and r2 should be constants, but if we allow method invocations anyway, we might as well allow non-constant switch cases in general while we are at it. However, then we run into parsing ambiguities with the cascade syntax. Might need to use ... or : instead of ...

What if r2 is before r1? Should we treat it as a range from r2 to r1 instead? (We'd need to check r1.compareTo(r2) at some point, and since calling compareTo isn't constant, it would have to be either the first time we evaluate the case, or every time - if the r1 and r2 expressions are not const, it probably has to be every time).

In any case, that should allow someToken.value (presumably of type String, or at least Comparable<String>) to match from "a" to "z" (which includes "aa", but not "za"). Not matching "za" *also* suggests a short-coming of this approach: It assumes the end is always inclusive. What you possibly want for the token is range from "a"to, but not including,"{"(the character afterz`)).

So, maybe we want to define two kinds of ranges, exclusive and inclusive : 1..10 and 1...9 (read ".." as "up to" and "..." as "up to including", as a memorization trick). Or 1:10 and 1::10. Or 1...10 an 1....10. Nope, last one definitely stepped over the "not readable" line. But something.

Then it would be somewhat irritating to have this perfectly good range syntax, and only being able to use it in one place. Why not for (var i in 0...9) print(digit(i));. That one seems like a slam-dunk, which suggests that a range could be a valid iterable, defined as the elements starting with a up to (possibly including) b. That only works if we have an increment operation, so only if the type T supports T operator +(int i). Numbers do (not really, but num does, and we might special case int and double as usual). Any other type which allows adding integers (anything where we can use ++) will also work.

So, the syntax e1 .. e2 or e1 ... e2 (or whichever we choose) would introduce an object of type Range with a contains method, which the switch case would use. If the type of e1, T, has a T operator+(int) method, it also implements RangeIterable which extends Range.

(Maybe a syntax like [e1:e2] would be better delimited, and look more like an iterable. We can even introduce [e1:e2:step] which adds step to e1 repeatedly until it's bigger than and/or equal to e2, with step defaulting to the integer 1).

Or, maybe just allow case in e where e implements Iterable<T> of the type of the switch, and we just call contains on the iterable. Then we don't need to use the same syntax for ranges of everything, and can do:

extension RangeIterable on int {
  Iterable<int> to(int end, [int step = 1]) => _IntRange(this, end, step, inclusive: true);
  Iterable<int> until(int end, [int step = 1]) => _IntRange(this, end, step, inclusive: false);
}
class _IntRange extends Iterable<int> {
  final int _start, _end, _step;
  final bool _inclusive;
  const _IntRange(this._start, this._end, this._step, {required bool inclusive}) : _inclusive = inclusive;
  int get length => (_end - _start + _step - (_inclusive ? 0 : 1)) ~/ _step;
  int get elementAt(int index) => _start + _step * index; // Allow extrapolation.
  bool contains(Object? value) => value is int && _start <= value && _inclusive ? _end >= value : _end > value;
  Iterator<int> get iterator => (() sync* {
    var end = _end;
    if (_inclusive) _end += 1;
    for (var i = _start; i < end; i += _step) yield i;
  })();
}

Then you could do:

switch (n) {
  case in 1.to(5): ...    // Implements Iterable<int>, so checks `1.to(5).contains(n)`.
}

With a little inlining, that should be reasonable.

lrhn avatar Jun 01 '21 08:06 lrhn

@lrhn 'a'..'z' only for single character: switch(str[0]) { ... } -> ord('a').to(ord('z')).contains(ord(value[0]))

ykmnkmi avatar Jun 01 '21 10:06 ykmnkmi

Dart doesn't have single characters, or a character type, so if 'a'..'z' is just intended as a shorthand for the integer range 0x61..0x7a, then ... I'd prefer to just introduce character code constants, like c'a' being equivalent to 'a'.runes.first, and then you can switch on value.runes.first directly.

Switching on a string is going to compare the entire string, not the first character (unless we introduce string-patters as part of pattern matching, say case RegExp(r'^[a-z]') as match: ... which matches only strings).

lrhn avatar Jun 01 '21 10:06 lrhn

case /[a-z]/ in: / case p'[a-z]' in:-> RegExp('[a-z]') in value ~ value.contains(RegExp('[a-z]'))

ykmnkmi avatar Jun 01 '21 10:06 ykmnkmi

If ranges were defined as operators then you could provide custom definitions for your own classes or extend existing classes. This might be enough to satisfy @ykmnkmi's use case without the need for an additional character type:

// using ... and ..< as in swift
extension CharacterRange on String {
  Iterable<String> operator ..<(String that) sync* {
    assert(length == 1);
    assert(that.length == 1);
    for (int c = codeUnitAt(0); c < that.codeUnitAt(0); c++) {
      yield String.fromCharCode(c);
    }
  }

  Iterable<String> operator ...(String that) sync* {
    assert(length == 1);
    assert(that.length == 1);
    for (int c = codeUnitAt(0); c <= that.codeUnitAt(0); c++) {
      yield String.fromCharCode(c);
    }
  }
}

mmcdon20 avatar Jun 01 '21 16:06 mmcdon20

FWIW, I'm looking here for a solution that I can confidently believe the compiler will optimize extremely well (e.g. table-based dispatch, jump math, or some such), for situations that are super time-critical, like a parser. I've no objection to general pattern-matching as a feature to solve a different problem, but I feel that's significant scope creep for the problem that I'm referring to in this issue.

Hixie avatar Jun 01 '21 17:06 Hixie

Dart switches can now do:

switch (opcode) { // opcode is a byte
  case <= 127:
    // ...
    break;
  case == 128:
    // ...
    break;
  case <= 132:
    // ...
    break;
  case <= 156:
    // ...
    break;
  default:
    // ...
}

which behaves like the original example would. There is no lower-end check because cases are in priority order.

I can't promise that the compilers will optimize this as much as you want, but nothing prevents them from doing so.

lrhn avatar Sep 25 '23 21:09 lrhn

You can even do ranges:

switch (opcode) { // opcode is a byte
  case >= 64 && <= 127:
    // ...
}

The big missing piece is that exhaustiveness checking doesn't understand integer ranges. It's doable, we just didn't have time for it in 3.0. I hope we'll get there eventually.

munificent avatar Jan 18 '24 02:01 munificent