Abstract

I propose to allow the max and min procs to take an additional argument, which specifies the criterion by which the comparisons are made. This could either be a single-argument procedure, or an it expression (similar to mapIt and friends).

Motivation

It is fairly common to want to get the maximum and minimum of a sequence of T, but by a different criterion than the usual < defined for T (or where < isn't even defined for T). Examples:

Get the string with the longest length
Get the number with the largest absolute value
Get the vector with the largest magnitude
Get the file in a directory with the most recent modified time

The algorithm module in the standard library already provides an analogous functionality for the sort family of functions: you can provide a cmp argument, or use sortedByIt. I suppose a third option could be to allow max and min to take a cmp function, although personally I find the cmp style to be fiddly in this context.

Other languages (e.g. Python) also allow you to pass a function to max and min.

Description

The max implemention in proc style:

proc max[T, U](x: openArray[T], op: proc(t: T): U): T =
  var currentMax = op(x[0])
  result = x[0]
  for i in 1..high(x):
    let criterion = op(x[i])
    if currentMax < criterion:
      currentMax = criterion
      result = x[i]

In it style:

# untested, adapted from sortedByIt implementation
func max[T](x: open_array[T], cmp: proc(x, y: T): int {.closure.}): T =
  result = x[0]
  for i in 1..high(x):
    if cmp(x[i], result) > 0:
      result = x[i]

template maxIt(x, op: untyped): untyped =
  var result = max(x, proc(x, y: typeof(items(x), typeOfIter)): int =
    var it {.inject.} = x
    let a = op
    it = y
    let b = op
    result = cmp(a, b))
  result

proc or `it`?

I'm not sure whether the criterion should be specified as a proc, or in it style, or whether to have both coexisting like is done with map and mapIt. Personally I prefer the proc style, but either or both are fine and have precedent. Thoughts?

Examples

Procedure way:

import sugar, sequtils

let raven = @["Once", "upon", "a", "midnight", "dreary"]
echo raven.max(x => x.len) # "midnight"
echo raven.min(x => x.len) # "a"

it way:

echo raven.maxIt(it.len) # "midnight"
echo raven.minIt(it.len) # "a"

Also the two-argument forms (i.e. not on an openArray) would allow for a criterion argument:

max(-10, 4, x => x.abs) # -10
min(-10, 4, x => x.abs) # 4

Other Things

While I'm on the subject, I thought of some other existing procs that might benefit from taking a criterion argument, and some new procs that could be added to sequtils on the same theme.

The maxIndex and minIndex procs in sequtils are the obvious next step (it versions omitted for brevity):

echo raven.maxIndex(x => x.len) # 3
echo raven.minIndex(x => x.len) # 2

(On an unrelated note, could we also add create argmax and argmin as aliases for maxIndex and minIndex? I thought the stdlib didn't have them for ages, until I noticed maxIndex and minIndex by accident while reading the sequtils docs.)

The find function could have variants which take a predicate argument:

echo raven.find(x => x.len < 4) # 2

It also seems useful to have first and last procs in sequtils:

echo raven.first # "Once"
echo raven.first(x => x.len > 4) # "midnight"
echo raven.last # "dreary"
echo raven.last(x => x.len == 4) # "upon"

(The return type should probably be Option[T] because the predicate might never be satisfied.)

Backward Incompatibility

This should not cause any backward-incompatible changes, since you would still be able to use max and min in the same way as before just by not passing in the criterion argument. These would be defined as additional versions of max and min, with different argument signatures, not modifying the existing definitions.

Dec 07 '21 19:12 ckp95

As long as these additions are not done to system.nim, I have no real objections. However, here is an idea: Instead of adding more and more things to sequtils and algorithm, let's have algorithms in their own modules:

sort.nim, map.nim, fold.nim, minmax.nim, ...

(Related to Nim v2.)

Dec 08 '21 06:12 Araq

you can already do

type AbsInt = distinct int
proc `<=`(a, b: AbsInt): bool = abs(int(a)) <= abs(int(b))
echo int(min(AbsInt(-10), AbsInt(-12)))

Dec 08 '21 15:12 RSDuck

I like this very much. Recently while just playing around with something I did data.sortedByIt(criterion it)[0] (when I needed just data.minByIt(criterion it)) because I was lazy.

Dec 08 '21 16:12 MichalMarsalek

maxIt shouldn't create a closure, it should inline everything, that's the main draw of mapIt/foldIt and friends.

Regarding Araq proposal I would classify like this:

higher order functions (map, fold)
aggregate functions (min, max, sum, argmin, argmax, prod, ...)
...

Jan 17 '22 20:01 mratsim

I want this.

Apr 18 '23 12:04 yojiyama7

I forgot I made this RFC because I haven't used Nim for a while. But I see that Nim 2 is out now so I'll take another look at it and try to make a PR for these functions soon(tm).

you can already do

type AbsInt = distinct int
proc `<=`(a, b: AbsInt): bool = abs(int(a)) <= abs(int(b))
echo int(min(AbsInt(-10), AbsInt(-12)))

Yeah I guess, but having to define a new type and a < proc is a hassle. It's overkill when all I usually want is to make an ad-hoc / one-off criterion. Why do it in three lines when you could do it in one?

Apr 18 '23 12:04 ckp95

Allow max/min to take a proc or `it` argument

Abstract

Motivation

Description

proc or it?

Examples

Other Things

Backward Incompatibility

proc or `it`?