go icon indicating copy to clipboard operation
go copied to clipboard

proposal: x/exp/slices: Allow different types for haystack/needle in BinarySearchFunc

Open Merovius opened this issue 1 year ago • 4 comments

I propose relaxing the constraints on BinarySearchFunc to allow using two different types:

func BinarySearchFunc[E, T any](x []E, target T, cmp func(E, T) int) (int, bool)

As it adds a type parameter, this is not backwards compatible, but it's exp and the extra type argument is inferable. We could also delay this until, if ever, we move the slices package to the stdlib.

Rationale

A usecase this just came up for me is that I wrote a helper package for Advent of Code to deal with sparse interval sets. A set is represented as a slice of intervals [a,b), [c,d),…, [y,z) with a<b<…<y<z. Looking up an element in this set is done via a binary search for an interval containing it. Similarly, intersections and unions of these sets involve a bunch of binary searching. In all of these cases, the algorithm naturally looks up an integer, not an interval.

With the current signature, the union version would be written as

lo, lok := slices.BinarySearchFunc(s, Interval[T]{i.Min, i.Min}, func(i, j Interval[T]) int {
	if j.Min > i.Max {
		return -1
	}
	if j.Min+1 < i.Min {
		return 1
	}
	return 0
})
// vs.
lo, lok := slices.BinarySearchFunc(s, i.Min, func(i Interval[T], v T) int {
	if v > j.Max {
		return -1
	}
	if v+1 < j.Min {
		return 1
	}
	return 0
}

To me, the second version is far more clearer. It makes obvious which argument is the needle and which is the haystack element. It does not involve creating an artificial Interval, nor taking a .Min to extract the needle from said Interval again (and from the code, it's really not clear whether that's semantically meaningful).

The two type parameter version seems far more natural to me and anecdotally, I believe it would have always been the right choice for me when I looked at slices.BinarySearchFunc. And I remember not using it at least once or twice because of this (instead resorting to sort.Search). As another example, if I have a slice of Person, sorted by name, looking up the index by name would be

slices.BinarySearchFunc(s, Person{Name:name}, func(a, b Person) int { return cmp(a.Name, b.Name) })
// vs.
slices.BinarySearchFunc(s, name, func(p Person, n name) int { return cmp(a.Name, n) })

Again, this just seems clearer to me.

Merovius avatar Dec 15 '22 23:12 Merovius