collection icon indicating copy to clipboard operation
collection copied to clipboard

Binary search functions should support searching by key instead of value

Open mverver-google opened this issue 4 years ago • 1 comments

I would like the algorithms library to support binary searching by key instead of by value. Currently binarySearchBy() and lowerBoundBy() already support comparing elements using a key-function, but they still require passing in a value to search for, instead of a key. This makes it unnecessarily difficult to use these functions efficiently in some cases.

Motivating use case

I have a sorted list of complex objects that are sorted by some field. I want use binary search to find an element by that field, but I want to avoid the overhead of constructing a full value.

Example:

class WeatherForecast {
  final DateTime day;
  final double minTemp;
  final double maxTemp;
  final double humidity;
  final double chanceOfRain;
  // etc.!

  WeatherForecast({
    required this.day,
    required this.minTemp,
    required this.maxTemp,
    required this.humidity,
    required this.chanceOfRain,
  });
}

void main() {
  final forecasts = <WeatherForecast>[
    WeatherForecast(
      day: DateTime.utc(2021, 7, 1),
      minTemp: 19.0,
      maxTemp: 29.0,
      humidity: 0.8,
      chanceOfRain: 0.9,
    ),
    WeatherForecast(
      day: DateTime.utc(2021, 7, 2),
      minTemp: 17.5,
      maxTemp: 26.0,
      humidity: 0.4,
      chanceOfRain: 0.2,
    ),
    // etc.
  ];
  final today = DateTime.utc(2021, 7, 2);

  // What I want to be able to do:
  //final index = forecasts.binarySearchByKey(today, (wf) => wf.day);

  // Best available alternative:
  // Note that this requires unnecessarily constructing an object (which may
  // be arbitrarily expensive/complicated depending on the number/type of
  // required fields:
  final index = forecasts.binarySearchBy(
    WeatherForecast(
      day: DateTime.utc(2021, 7, 2),
      minTemp: 0,
      maxTemp: 0,
      humidity: 0,
      chanceOfRain: 0,
    ),
    (wf) => wf.day,
  );

  if (index == -1) {
    print('Forecast not found!');
  } else {
    print(forecasts[index].minTemp);  // 17.5
  }
}

(This is using the List extension binarySearchBy from the Flutter libraries, which just forwards to the algorithms library.)

In the above example, I think it should be possible to search by date without having to construct a full WeatherForecast object.

In extreme cases, constructing a value type might even have side-effects (e.g., if the objects represent open files keyed by filename) in which case constructing the value type is undesirabl.

Implementation effort

This change should be extremely simple to implement, if the API owners are willing to expose another pair of functions, for example: binarySearchByKey() and lowerBoundByKey() (exact names are up for debate).

Implementation is simple. Consider the current implementation here: https://github.com/dart-lang/collection/blob/75a7a5510979a3cd70143af85bcc1667ee233674/lib/src/algorithms.dart#L89

All that's necessary is to pull out the line var key = keyOf(value) and make key a parameter.

Concretely, change:

int binarySearchBy<E, K>(List<E> sortedList, K Function(E element) keyOf,
    int Function(K, K) compare, E value,
    [int start = 0, int? end]) {
  end = RangeError.checkValidRange(start, end, sortedList.length);
  var min = start;
  var max = end;
  var key = keyOf(value);
  while (min < max) {
    var mid = min + ((max - min) >> 1);
    var element = sortedList[mid];
    var comp = compare(keyOf(element), key);
    if (comp == 0) return mid;
    if (comp < 0) {
      min = mid + 1;
    } else {
      max = mid;
    }
  }
  return -1;
}

into:

int binarySearchBy<E, K>(List<E> sortedList, K Function(E element) keyOf,
    int Function(K, K) compare, E value,
    [int start = 0, int? end]) {
  return binarySearchByKey<E, K>(sortedList, keyOf, compare, keyOf(value), start, end);
}

int binarySearchByKey<E, K>(List<E> sortedList, K Function(E element) keyOf,
    int Function(K, K) compare, K key,
    [int start = 0, int? end]) {
  end = RangeError.checkValidRange(start, end, sortedList.length);
  var min = start;
  var max = end;
  while (min < max) {
    var mid = min + ((max - min) >> 1);
    var element = sortedList[mid];
    var comp = compare(keyOf(element), key);
    if (comp == 0) return mid;
    if (comp < 0) {
      min = mid + 1;
    } else {
      max = mid;
    }
  }
  return -1;
}

And then exactly the same change for lowerBoundBy(), of course.

mverver-google avatar Jun 30 '21 18:06 mverver-google

I just encountered a case where something like this could be useful as well. A potential alternative API could be what rust does, which I find a bit simpler. Here's how it would look like in dart:


/// Binary searches this sorted list with a comparator function.
///
/// The comparator function should implement an order with the sort order
/// of the list, returning an order code that indicates whether its argument
/// is less, equal or greater the desired target.
///
/// If the value is found then the index of the matching element is returned.
/// If there are multiple matches, then any one of the matches could be
/// returned. If the value is not found then -1 is returned.
int binarySearchBy<E>(List<E> sortedList, int Function(E element) compare,  [int start = 0, int? end]) {
  end = RangeError.checkValidRange(start, end, sortedList.length);
  var min = start;
  var max = end;
  while (min < max) {
    var mid = min + ((max - min) >> 1);
    var element = sortedList[mid];
    var comp = compare(element);
    if (comp == 0) return mid;
    if (comp < 0) {
      min = mid + 1;
    } else {
      max = mid;
    }
  }
  return -1;
}

(I haven't actually checked if this code works but I hope you get the idea)

Instead of taking a compare function that takes two arguments and a value/key to look for, this version only takes a compare function that takes one argument. It is now up to the compare function if it wants to compare the value with another value, or derive a key and compare it. Interestingly, Rust handles the use case for lowerBound with binarySearch as well, by returning the index where the element would go if it is not found.

miDeb avatar Oct 31 '21 19:10 miDeb

We now have binarySeachBy.

lrhn avatar Sep 09 '22 12:09 lrhn