go icon indicating copy to clipboard operation
go copied to clipboard

proposal: slices: new standard library package based on x/exp/slices

Open ianlancetaylor opened this issue 2 years ago • 34 comments

In #45955 I proposed a new slices package in the standard library. After considerable discussion there and in discussion #47203 we settled on an API and added the experimental package golang.org/x/exp/slices.

We now have experience with this package in practice. pkg.go.dev reports that it is imported by 1,541 other packages. I propose that it is now time to move this package into the standard library for the 1.21 release.

Doing this doesn't mean that we can't add more functions to the standard library slices package later. Any such additions should be suggested in their own separate proposal.

The specific API that we will put into the standard library appears below.

This description of the API refers to constraints.Ordered in a few places; if the constraints package is not present in the standard library, that reference will be replaced by an unexported constraint with the same set of types.

Note that we should either accept or reject #57348 before we accept this proposal. If we accept #57348, the appropriate change will be made here.

I'm not aware of any other proposal to modify the existing slices package API. There are a few proposals to add new functions to the slices package (at least #52434, #53987, #54768, #56353); those proposals can move from x/exp/slices to just plain slices.

Proposed API:

// Equal reports whether two slices are equal: the same length and all
// elements equal. If the lengths are different, Equal returns false.
// Otherwise, the elements are compared in increasing index order, and the
// comparison stops at the first unequal pair.
// Floating point NaNs are not considered equal.
func Equal[E comparable](s1, s2 []E) bool

// EqualFunc reports whether two slices are equal using a comparison
// function on each pair of elements. If the lengths are different,
// EqualFunc returns false. Otherwise, the elements are compared in
// increasing index order, and the comparison stops at the first index
// for which eq returns false.
func EqualFunc[E1, E2 any](s1 []E1, s2 []E2, eq func(E1, E2) bool) bool

// Compare compares the elements of s1 and s2.
// The elements are compared sequentially, starting at index 0,
// until one element is not equal to the other.
// The result of comparing the first non-matching elements is returned.
// If both slices are equal until one of them ends, the shorter slice is
// considered less than the longer one.
// The result is 0 if s1 == s2, -1 if s1 < s2, and +1 if s1 > s2.
// Comparisons involving floating point NaNs are ignored.
func Compare[E constraints.Ordered](s1, s2 []E) int

// CompareFunc is like Compare but uses a comparison function
// on each pair of elements. The elements are compared in increasing
// index order, and the comparisons stop after the first time cmp
// returns non-zero.
// The result is the first non-zero result of cmp; if cmp always
// returns 0 the result is 0 if len(s1) == len(s2), -1 if len(s1) < len(s2),
// and +1 if len(s1) > len(s2).
func CompareFunc[E1, E2 any](s1 []E1, s2 []E2, cmp func(E1, E2) int) int

// Index returns the index of the first occurrence of v in s,
// or -1 if not present.
func Index[E comparable](s []E, v E) int

// IndexFunc returns the first index i satisfying f(s[i]),
// or -1 if none do.
func IndexFunc[E any](s []E, f func(E) bool) int

// Contains reports whether v is present in s.
func Contains[E comparable](s []E, v E) bool

// ContainsFunc reports whether at least one
// element e of s satisfies f(e).
func ContainsFunc[E any](s []E, f func(E) bool) bool

// Insert inserts the values v... into s at index i,
// returning the modified slice.
// In the returned slice r, r[i] == v[0].
// Insert panics if i is out of range.
// This function is O(len(s) + len(v)).
func Insert[S ~[]E, E any](s S, i int, v ...E) S

// Delete removes the elements s[i:j] from s, returning the modified slice.
// Delete panics if s[i:j] is not a valid slice of s.
// Delete modifies the contents of the slice s; it does not create a new slice.
// Delete is O(len(s)-j), so if many items must be deleted, it is better to
// make a single call deleting them all together than to delete one at a time.
// Delete might not modify the elements s[len(s)-(j-i):len(s)]. If those
// elements contain pointers you might consider zeroing those elements so that
// objects they reference can be garbage collected.
func Delete[S ~[]E, E any](s S, i, j int) S

// Replace replaces the elements s[i:j] by the given v, and returns the
// modified slice. Replace panics if s[i:j] is not a valid slice of s.
func Replace[S ~[]E, E any](s S, i, j int, v ...E) S

// Clone returns a copy of the slice.
// The elements are copied using assignment, so this is a shallow clone.
func Clone[S ~[]E, E any](s S) S

// Compact replaces consecutive runs of equal elements with a single copy.
// This is like the uniq command found on Unix.
// Compact modifies the contents of the slice s; it does not create a new slice.
// When Compact discards m elements in total, it might not modify the elements
// s[len(s)-m:len(s)]. If those elements contain pointers you might consider
// zeroing those elements so that objects they reference can be garbage collected.
func Compact[S ~[]E, E comparable](s S) S

// CompactFunc is like Compact but uses a comparison function.
func CompactFunc[S ~[]E, E any](s S, eq func(E, E) bool) S

// Grow increases the slice's capacity, if necessary, to guarantee space for
// another n elements. After Grow(n), at least n elements can be appended
// to the slice without another allocation. If n is negative or too large to
// allocate the memory, Grow panics.
func Grow[S ~[]E, E any](s S, n int) S

// Clip removes unused capacity from the slice, returning s[:len(s):len(s)].
func Clip[S ~[]E, E any](s S) S

// Sort sorts a slice of any ordered type in ascending order.
// Sort may fail to sort correctly when sorting slices of floating-point
// numbers containing Not-a-number (NaN) values.
// Use slices.SortFunc(x, func(a, b float64) bool {return a < b || (math.IsNaN(a) && !math.IsNaN(b))})
// instead if the input may contain NaNs.
func Sort[E constraints.Ordered](x []E)

// SortFunc sorts the slice x in ascending order as determined by the less function.
// This sort is not guaranteed to be stable.
//
// SortFunc requires that less is a strict weak ordering.
// See https://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings.
func SortFunc[E any](x []E, less func(a, b E) bool)

// SortStableFunc sorts the slice x while keeping the original order of equal
// elements, using less to compare elements.
func SortStableFunc[E any](x []E, less func(a, b E) bool)

// IsSorted reports whether x is sorted in ascending order.
func IsSorted[E constraints.Ordered](x []E) bool

// IsSortedFunc reports whether x is sorted in ascending order, with less as the
// comparison function.
func IsSortedFunc[E any](x []E, less func(a, b E) bool) bool

// BinarySearch searches for target in a sorted slice and returns the position
// where target is found, or the position where target would appear in the
// sort order; it also returns a bool saying whether the target is really found
// in the slice. The slice must be sorted in increasing order.
func BinarySearch[E constraints.Ordered](x []E, target E) (int, bool)

// BinarySearchFunc works like BinarySearch, but uses a custom comparison
// function. The slice must be sorted in increasing order, where "increasing" is
// defined by cmp. cmp(a, b) is expected to return an integer comparing the two
// parameters: 0 if a == b, a negative number if a < b and a positive number if
// a > b.
func BinarySearchFunc[E any](x []E, target E, cmp func(E, E) int) (int, bool)

ianlancetaylor avatar Dec 21 '22 22:12 ianlancetaylor

Note that we should either accept or reject #57348 before we accept this proposal. If we accept #57348, the appropriate change will be made here.

Out of interest: Is there code which would compile with the current version but would stop compiling with my proposal? I can't really think of how that would be the case, because any code using slices.BinarySearchFunc should just be inferred to two identical type arguments, no? [edit] I mean, I said on that proposal that it's not backwards compatible, as if it's self-explanatory, so I must've thought of something, but I can't remember what… [/edit]

Merovius avatar Dec 21 '22 23:12 Merovius

it is imported by 1,541 other packages

Usage statistics per function would be interesting. If there's a function that isn't used at all, that could be a sign that it's not necessary.

fzipp avatar Dec 21 '22 23:12 fzipp

I don't know if it has been brought up, but Contains fits into the family None/Any/All - if we had all three, it would be Any. I've recently found myself needing All a couple of times. I bring this up not to suggest adding the others, but because I can't think of how I'd name the others if we stick with the name Contains.

I also find it a bit of a rough edge that it's sometimes less func(a, b T) bool and sometimes cmp func(a, b T) int. If I want a slice of things which can both be sorted and searched (and let's be honest, anything that needs to be searched will have to be sorted first) I'll have to write two, mostly redundant functions.

Merovius avatar Dec 21 '22 23:12 Merovius

I don't know if it has been brought up, but Contains fits into the family None/Any/All - if we had all three, it would be Any.

I think it was discussed in https://github.com/golang/go/issues/53983

fzipp avatar Dec 21 '22 23:12 fzipp

Fair enough.

Merovius avatar Dec 21 '22 23:12 Merovius

Is there code which would compile with the current version but would stop compiling with my proposal?

I had to think about it, but I think the answer is yes:

    f := slices.BinarySearchFunc[int]

We can't infer the second type argument in the absence of any actual arguments.

ianlancetaylor avatar Dec 22 '22 00:12 ianlancetaylor

It's a pity we can't have func SortFunc[E any, EP E | *E](x []E, less func(a, b EP) bool). It can be kind of expensive to pass largish elements by value.

ghost avatar Dec 22 '22 03:12 ghost

Can we get a decision on #52427 before this is merged, to nail down the constraints package issue?

earthboundkid avatar Dec 22 '22 06:12 earthboundkid

It's a pity we can't have func SortFunc[E any, EP E | *E](x []E, less func(a, b EP) bool). It can be kind of expensive to pass largish elements by value.

Isn‘t only the header passed by value? Then size shouldn‘t make a difference regarding performance.

andig avatar Dec 22 '22 06:12 andig

It's a pity we can't have func SortFunc[E any, EP E | *E](x []E, less func(a, b EP) bool). It can be kind of expensive to pass largish elements by value.

Isn‘t only the header passed by value? Then size shouldn‘t make a difference regarding performance.

Not the slice, the comparison function.

ericlagergren avatar Dec 22 '22 07:12 ericlagergren

Why is there a stable variant of SortFunc, but not of Sort?

Edit: I think I found the answer. It's because only value types can satisfy constraints.Ordered.

fzipp avatar Dec 22 '22 07:12 fzipp

@fzipp Arguably there should be.

As usually, the special case are floating point numbers, where distinguishable elements can still compare as equal. Distinguishing the stable/unstable sorting result requires using math.Signbit in this case or might require math.Float64bits for NaN, but they can be distinguished.

But it seems very much a corner case and people who care can then still result to SortStableFunc.

For all other types satisfying constraints.Ordered, the stable/unstable sorting result will always be indistinguishable.

Merovius avatar Dec 22 '22 09:12 Merovius

We now have experience with this package in practice.

I would have liked to hear more experience reports from people who have used this package. But maybe the silence just means they're happy so far.

gophun avatar Dec 22 '22 09:12 gophun

@gophun

I've been using the slices package over the last year or so whenever I thought of it. In particular, I've been using it a bunch for Advent of Code over the last three weeks. That's a small sample and it's code for a specific kind of problem, but OTOH it's all fresh code from scratch, so it's probably the most representative sample I have at hand for how I write code "right now". The TL;DR is that I use or could use almost all of its API and I use it quite frequently (which also feels true for my day job).

Stats/Details Grepping through the source, I call:
  4 Clone
  4 Index
  2 Sort
  3 SortFunc
  3 SortStableFunc

I would have used slices.BinarySearchFunc 5 times, if #57348 was accepted. That's ~20 calls in ~2700 SLOC, which is a decent amount. It's also one of the most-imported Go project packages in that codebase:

  1 	"crypto/sha1"
  1 	"encoding/base64"
  1 	"encoding/binary"
  1 	"encoding/json"
  1 	"golang.org/x/net/publicsuffix"
  1 	"image"
  1 	"image/color"
  1 	"image/png"
  1 	"io/fs"
  1 	"net/http"
  1 	"net/http/cookiejar"
  1 	"net/url"
  1 	"reflect"
  1 	"regexp"
  1 	"sort"
  1 	"sync"
  2 	"context"
  2 	"path/filepath"
  2 	"testing"
  2 	"time"
  2 	"unicode/utf8"
  3 	"bufio"
  4 	"bytes"
  5 	"flag"
  5 	"golang.org/x/exp/constraints"
  6 	"strconv"
  9 -->	"golang.org/x/exp/slices"
 10 	"io"
 10 	"strings"
 12 	"errors"
 21 	"log"
 22 	"os"
 30 	"fmt"

Looking at the API, I probably would have used Delete, Insert and Replace for one problem and Clip in at least one place, if I had thought of them. I also know that I've written versions of Compact and Grow a couple of times in the past.

APIs I haven't used and don't remember needing are Compare*, Contains* and IsSorted*, but there's also significant personal preference in that.

One thing I've been missing a couple of times are Map and Filter, but I think it makes sense to delay those until we solve iteration.

Overall, I find myself reaching for slices quite frequently and I use most of its API occasionally, with a decent bias towards the *Func variants. I think an interesting contrast is golang.org/x/exp/maps, which I don't seem to be using much, if at all.

Merovius avatar Dec 22 '22 13:12 Merovius

Can we also add the in-place Reverse([]T) function? It's a common three-liner.

adonovan avatar Dec 22 '22 14:12 adonovan

We could also produce a set of vet analyzers that can suggest fixes that would safely transform pre-generic code using explicit allocations/loops into calls to these new functions.

adonovan avatar Dec 22 '22 19:12 adonovan

We could also produce a set of vet analyzers that can suggest fixes that would safely transform pre-generic code using explicit allocations/loops into calls to these new functions.

That feels like it could get a bunch of false positives. That could be annoying if they show up in go test runs.

DeedleFake avatar Dec 22 '22 19:12 DeedleFake

We could also produce a set of vet analyzers that can suggest fixes that would safely transform pre-generic code using explicit allocations/loops into calls to these new functions.

That feels like it could get a bunch of false positives. That could be annoying if they show up in go test runs.

Sorry, I meant vet analyzer here as a shorthand for golang.org/x/tools/go/analysis analyzers, along with a standalone tool that one could apply as desired to your codebase; I didn't mean they would actually be in the vet tool.

adonovan avatar Dec 22 '22 19:12 adonovan

I would have liked to hear more experience reports from people who have used this package. But maybe the silence just means they're happy so far.

I think the slices package has chosen the right focus. Go slices have quirks; it's really nice to have the operations slices provides, with implementation that is plainly tuned to (in-)variants particular to the slice data structure.

Eventually, I think it's really natural to want backed-by-slices, generics-oriented versions of what e.g. container/heap, container/ring do, or stacks/queues, or multi-dimensional slices, etc., etc., but it feels right that slices doesn't get into this.

AndrewHarrisSPU avatar Dec 22 '22 21:12 AndrewHarrisSPU

This is good but what of a C++ STL equivalent, but with chain-able methods for functional programming feel

Aceix avatar Dec 22 '22 21:12 Aceix

The one function I find myself still regularly missing is the equivalent of maps.DeleteFunc. I understand why slices.Filter isn’t in the package (yet?), but an in place filter would be helpful.

earthboundkid avatar Dec 22 '22 21:12 earthboundkid

I think it will be best if suggestions for new functions in the slices package go into separate proposals. Whether or not we add new functions to the slices package does not affect whether the slices package will go into the standard library. Similarly, whether or not the slices package goes into the standard library does not affect whether we add new functions. Thanks.

ianlancetaylor avatar Dec 22 '22 22:12 ianlancetaylor

Usage statistics per function would be interesting. If there's a function that isn't used at all, that could be a sign that it's not necessary.

@fzipp, scanning the latest module of all publicly known modules on 2022-10-01, the usage stats are as follows:

Function Count
Contains 1481
Clone 531
Sort 433
Equal 404
SortFunc 336
Index 190
IndexFunc 166
Delete 145
Insert 82
BinarySearch 60
SortStableFunc 42
EqualFunc 41
Clip 38
Compact 37
IsSorted 36
Grow 29
BinarySearchFunc 26
Compare 20
IsSortedFunc 18
Replace 10
CompactFunc 5
CompareFunc 3
ContainsFunc 1

I find it ironic that the most used function is Contains, while the least used function is it's sibling ConrtainsFunc.

dsnet avatar Dec 22 '22 23:12 dsnet

ContainsFunc was only added to the package on December 5, so it's not that surprising that it's not used very often.

ianlancetaylor avatar Dec 22 '22 23:12 ianlancetaylor

Replace seems borderline underused for inclusion here.

hherman1 avatar Dec 23 '22 01:12 hherman1

Replace is also fairly new: https://github.com/golang/exp/commit/a1e5550cf13efb62a5f12f594abb7a7f238b38a6

DeedleFake avatar Dec 23 '22 02:12 DeedleFake

ContainsFunc was only added to the package on December 5, so it's not that surprising that it's not used very often.

Yeah, I was about to say. A bunch of my code defines some version of ContainsFunc in terms of IndexFunc. For example, I recently defined

func contains(ctx []Context, s string) bool {
	return slices.IndexFunc(ctx, func(c Context) bool { return c.Name == s }) >= 0
}

I would bet that a large fraction of IndexFunc usages are just checking for the presence of a non-negative index, and could be replaced with ContainsFunc.

smasher164 avatar Dec 23 '22 06:12 smasher164

The other major function on slices I use is lo.Map from this package: https://pkg.go.dev/github.com/samber/lo#Map. It might be too late to request this be added here, but every time I pull in the lo dependency, it's basically so I can call Map over some slice. For example, one of my String() implementations uses Map to construct it

return "<" + strings.Join(lo.Map(t, func(f TyField, i int) string { return f.DeBruijnString() }), ", ") + ">"

smasher164 avatar Dec 23 '22 06:12 smasher164

I think we should delay inclusion of anything similar to Map until we decide on user-defined iteration using range.

DmitriyMV avatar Dec 23 '22 10:12 DmitriyMV

any chance of squeezing this in https://github.com/golang/exp/pull/53

@alandonovan @ianlancetaylor

TonyDMorris avatar Dec 23 '22 11:12 TonyDMorris