cpython gh-125916: Allow functools.reduces 'initial' to be a keyword argument

trafficstars

Before:

from functools import reduce
from operator import sub
>>> reduce(sub, [1, 1, 2, 3, 5, 8], 21)
1
>>> reduce(sub, [1, 1, 2, 3, 5, 8], initial=21)
TypeError: reduce() takes no keyword arguments

After:

from functools import reduce
from operator import sub
>>> reduce(sub, [1, 1, 2, 3, 5, 8], 21)
1
>>> reduce(sub, [1, 1, 2, 3, 5, 8], initial=21)
1

Issue: gh-125916

📚 Documentation preview 📚: https://cpython-previews--125917.org.readthedocs.build/

Oct 24 '24 08:10 sayandipdutta

All commit authors signed the Contributor License Agreement.

Oct 24 '24 08:10 ghost

CC @sobolevn, as you have added AC comment.

Oct 24 '24 08:10 skirpichev

I did some benchmarks.

# a.py
import pyperf
from functools import reduce

f = lambda x, y: x + y
lst = list(range(10))
init = 123

runner = pyperf.Runner()
runner.bench_func('reduce(f, lst)', reduce, f, lst)
runner.bench_func('reduce(f, lst, init)', reduce, f, lst, init)

Run e.g. with: python a.py -q -o ref.json.

with results:

Benchmark	ref	patch
reduce(f, lst)	2.18 us	2.42 us: 1.11x slower
reduce(f, lst, init)	2.35 us	2.64 us: 1.12x slower
Geometric mean	(ref)	1.12x slower

Oct 24 '24 09:10 skirpichev

Now with AC (patch2).

Patch with AC (a draft).

diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c
index 802b1cf792..8faa8ad1ac 100644
--- a/Modules/_functoolsmodule.c
+++ b/Modules/_functoolsmodule.c
@@ -932,15 +932,30 @@ _functools_cmp_to_key_impl(PyObject *module, PyObject *mycmp)
 
 /* reduce (used to be a builtin) ********************************************/
 
-// Not converted to argument clinic, because of `args` in-place modification.
-// AC will affect performance.
+/*[clinic input]
+_functools.reduce
+
+    function as func: object
+    iterable as seq: object
+    /
+    initial as result: object(c_default="NULL") = None
+
+Apply a function of two arguments cumulatively.
+
+Apply it to the items of a sequence or iterable, from left to right, so as to
+reduce the iterable to a single value.  For example, reduce(lambda x, y: x+y,
+[1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5).  If initial is present, it is
+placed before the items of the iterable in the calculation, and serves as a
+default when the iterable is empty.
+[clinic start generated code]*/
+
 static PyObject *
-functools_reduce(PyObject *self, PyObject *args)
+_functools_reduce_impl(PyObject *module, PyObject *func, PyObject *seq,
+                       PyObject *result)
+/*[clinic end generated code: output=30d898fe1267c79d input=b7082b8b1473fdc2]*/
 {
-    PyObject *seq, *func, *result = NULL, *it;
+    PyObject *args, *it;
 
-    if (!PyArg_UnpackTuple(args, "reduce", 2, 3, &func, &seq, &result))
-        return NULL;
     if (result != NULL)
         Py_INCREF(result);
 
@@ -1006,16 +1021,6 @@ functools_reduce(PyObject *self, PyObject *args)
     return NULL;
 }
 
-PyDoc_STRVAR(functools_reduce_doc,
-"reduce(function, iterable[, initial], /) -> value\n\
-\n\
-Apply a function of two arguments cumulatively to the items of a sequence\n\
-or iterable, from left to right, so as to reduce the iterable to a single\n\
-value.  For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates\n\
-((((1+2)+3)+4)+5).  If initial is present, it is placed before the items\n\
-of the iterable in the calculation, and serves as a default when the\n\
-iterable is empty.");
-
 /* lru_cache object **********************************************************/
 
 /* There are four principal algorithmic differences from the pure python version:
@@ -1720,7 +1725,7 @@ PyDoc_STRVAR(_functools_doc,
 "Tools that operate on functions.");
 
 static PyMethodDef _functools_methods[] = {
-    {"reduce",          functools_reduce,     METH_VARARGS, functools_reduce_doc},
+    _FUNCTOOLS_REDUCE_METHODDEF
     _FUNCTOOLS_CMP_TO_KEY_METHODDEF
     {NULL,              NULL}           /* sentinel */
 };

You should run ./python Tools/clinic/clinic.py Modules/_functoolsmodule.c to update autogenerated code.

I did some benchmarks.

# a.py
import pyperf
from functools import reduce

f = lambda x, y: x + y
lst = list(range(10))
init = 123

runner = pyperf.Runner()
runner.bench_func('reduce(f, lst)', reduce, f, lst)
runner.bench_func('reduce(f, lst, init)', reduce, f, lst, init)

Run e.g. with: python a.py -q -o ref.json.

with results:

Benchmark	ref	patch	patch2
reduce(f, lst)	2.18 us	2.42 us: 1.11x slower	2.11 us: 1.03x faster
reduce(f, lst, init)	2.35 us	2.64 us: 1.12x slower	2.27 us: 1.04x faster
Geometric mean	(ref)	1.12x slower	1.03x faster

Looks the patch with AC even slightly faster than in the main.

Oct 24 '24 09:10 skirpichev

@skirpichev Is initial=None safe for backward compatibility? Does this mean reduce(Callable[[None, T], None], Iterable[T], None) will behave differently in 3.13 and 3.14?

Oct 24 '24 10:10 sayandipdutta

@skirpichev Is initial=None safe for backward compatibility? Does this mean reduce(Callable[[None, T], None], Iterable[T], None) will behave differently in 3.13 and 3.14?

No, it doesn't. Someone can use None as initial value.

Oct 24 '24 10:10 Eclips4

it should be done for all parameters, not just the initial

That might slowdown the patch v2.

No, it doesn't. Someone can use None as initial value.

That was a draft;) I think we could use same trick as for the Python version.

BTW, it seems the PEP 661 doesn't cover this at all.

Edit:

Updated AC patch with a sentinel value.

diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c
index 802b1cf792..00b4a5e6cc 100644
--- a/Modules/_functoolsmodule.c
+++ b/Modules/_functoolsmodule.c
@@ -932,15 +932,31 @@ _functools_cmp_to_key_impl(PyObject *module, PyObject *mycmp)
 
 /* reduce (used to be a builtin) ********************************************/
 
-// Not converted to argument clinic, because of `args` in-place modification.
-// AC will affect performance.
+/*[clinic input]
+_functools.reduce
+
+    function as func: object
+    iterable as seq: object
+    /
+    initial as result: object(c_default="NULL") = _functools._initial_missing
+
+Apply a function of two arguments cumulatively to an iterable, from left to right.
+
+This efficiently reduce the iterable to a single value.  If initial is present,
+it is placed before the items of the iterable in the calculation, and serves as
+a default when the iterable is empty.
+
+For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])
+calculates ((((1+2)+3)+4)+5).
+[clinic start generated code]*/
+
 static PyObject *
-functools_reduce(PyObject *self, PyObject *args)
+_functools_reduce_impl(PyObject *module, PyObject *func, PyObject *seq,
+                       PyObject *result)
+/*[clinic end generated code: output=30d898fe1267c79d input=40be8069bcbc1a75]*/
 {
-    PyObject *seq, *func, *result = NULL, *it;
+    PyObject *args, *it;
 
-    if (!PyArg_UnpackTuple(args, "reduce", 2, 3, &func, &seq, &result))
-        return NULL;
     if (result != NULL)
         Py_INCREF(result);
 
@@ -1006,16 +1022,6 @@ functools_reduce(PyObject *self, PyObject *args)
     return NULL;
 }
 
-PyDoc_STRVAR(functools_reduce_doc,
-"reduce(function, iterable[, initial], /) -> value\n\
-\n\
-Apply a function of two arguments cumulatively to the items of a sequence\n\
-or iterable, from left to right, so as to reduce the iterable to a single\n\
-value.  For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates\n\
-((((1+2)+3)+4)+5).  If initial is present, it is placed before the items\n\
-of the iterable in the calculation, and serves as a default when the\n\
-iterable is empty.");
-
 /* lru_cache object **********************************************************/
 
 /* There are four principal algorithmic differences from the pure python version:
@@ -1720,7 +1726,7 @@ PyDoc_STRVAR(_functools_doc,
 "Tools that operate on functions.");
 
 static PyMethodDef _functools_methods[] = {
-    {"reduce",          functools_reduce,     METH_VARARGS, functools_reduce_doc},
+    _FUNCTOOLS_REDUCE_METHODDEF
     _FUNCTOOLS_CMP_TO_KEY_METHODDEF
     {NULL,              NULL}           /* sentinel */
 };
@@ -1789,6 +1795,10 @@ _functools_exec(PyObject *module)
     // lru_list_elem is used only in _lru_cache_wrapper.
     // So we don't expose it in module namespace.
 
+    if (PyModule_Add(module, "_initial_missing", _PyObject_New(&PyBaseObject_Type)) < 0) {
+        return -1;
+    }
+
     return 0;
 }

Oct 24 '24 10:10 skirpichev

@skirpichev should the default be specified at all? I think reduce(function, iterable, /, initial) is closer representation of internal working than reduce(function, iterable, /, initial=None). Or is there some sort of a convention?

EDIT: Ah scratch that. That makes initial required. I meant reduce(function, iterable, /[, initial])

Oct 24 '24 11:10 sayandipdutta

I think reduce(function, iterable, /, initial) is closer representation

No. Current code in the main more accurately can be described as function with multiple signatures. Funny notation reduce(function, iterable[, initial], /) means it's possible to have two signature:

reduce(function, iterable, /)
reduce(function, iterable, initial, /)

The AC can't represent multiple signatures yet. The only way - using the sentinel value _initial_missing, like pure-Python version does. See updated patch above. You shouldn't use None as default value.

Oct 24 '24 11:10 skirpichev

Do you update this test?

https://github.com/python/cpython/blob/ad6110a93ffa82cae71af6c78692de065d3871b5/Lib/test/test_inspect/test_inspect.py#L5699-L5701

Oct 24 '24 16:10 nineteendo

Yes, tests should be adjusted. @sayandipdutta, CI must be green! (FYI: https://devguide.python.org/getting-started/pull-request-lifecycle/#making-good-prs)

I've tried to implement @Eclips4 suggestion to make all arguments positional-or-keyword, but this seems to be impossible with AC now. Looks as a bug.

Oct 24 '24 16:10 skirpichev

I've tried to implement @Eclips4 suggestion to make all arguments positional-or-keyword, but this seems to be impossible with AC now. Looks as a bug.

What's the problem?

Oct 24 '24 17:10 Eclips4

Yes, I just pushed the changes, i.e. made it unsupported instead of no_signature. I was trying to look for docs regarding this, is there any? Like what makes a signature unsupported or no_signature?

EDIT: I added an entry in Misc/ACKS following blurb output. But I just noticed Crediting, and seems like I shouldn't have done that preemptively? If that's the case, LMK I will remove it.

Oct 24 '24 17:10 sayandipdutta

Yes, I just pushed the changes, i.e. made it unsupported instead of no_signature.

Looks ok for me.

I was trying to look for docs regarding this, is there any? Like what makes a signature unsupported or no_signature?

You can see _test_module_has_signatures() helper: https://github.com/python/cpython/blob/2705ffa575ff12a27c9ff9d8be086df3071903a4/Lib/test/test_inspect/test_inspect.py#L5542

It tests signature as unsupported (e.g. because the __text_signature__ added - can't be parsed by the inspect module).

What's the problem?

@Eclips4, see https://github.com/python/cpython/issues/125954

Note also, that for pure-Python version in the master we have this declaration:

def reduce(function, sequence, initial=_initial_missing):

while in the sphinx docs we have instead reduce(function, iterable[, initial], /). What naming we can count as a part of the API, sequence or iterable?

Oct 25 '24 04:10 skirpichev

For what it's worth I'm not convinced we need to make all parameters pos-and-keyword. I think there's a much stronger use case for passing initial= as a keyword argument, since it's an optional argument and even people who understand what reduce does in general might not immediately recognize it.

Oct 25 '24 18:10 JelleZijlstra

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Oct 25 '24 19:10 bedevere-app[bot]

For what it's worth I'm not convinced we need to make all parameters pos-and-keyword. I think there's a much stronger use case for passing initial= as a keyword argument, since it's an optional argument and even people who understand what reduce does in general might not immediately recognize it.

+1

Moreover, I think that this change should be broken up in two steps:

Adapt reduce() to Argument Clinic (including benchmarks)

Allow initial to be a keyword argument

We know that when adapting functions and methods to Argument Clinic, it is easy to introduce subtle bugs, so I would really like this to be a separate change.

I support the opinion that having all parameters as pos-and-keyword is not very useful, but if we only make initial pos-and keyword, then we need to decide what to do about Python implementation, which currently treats all parameters like pos-and-keyword arguments. Maybe we should start raising a warning if function and sequence arguments are passed as keyword arguments?

Oct 25 '24 21:10 Eclips4

[...] then we need to decide what to do about Python implementation [...]

Differences between functools.py and _functoolsmodule.c is out of scope for this issue/change.

Oct 25 '24 21:10 erlend-aasland

@erlend-aasland as AC changes coming from my patch, probably 1) point - is my job. I think that @sayandipdutta can keep this PR as-is for a while. I'll make a separate PR with AC-related changes.

Oct 26 '24 01:10 skirpichev

PR, that switch to AC: https://github.com/python/cpython/pull/125999

Oct 26 '24 03:10 skirpichev

Thanks a lot @skirpichev! It seems all I have to do is wait for your PR to be merged and then merge main into my PR. Will do so.

Oct 26 '24 07:10 sayandipdutta

Meta: I'll marked this as draft until Sergey's PR has landed.

Oct 26 '24 20:10 erlend-aasland

Meta: I'll marked this as draft until Sergey's PR has landed.

Sergey's Argument Clinic adaption landed just now. Please resolve conflicts, regenerate clinic, and mark the PR ready for review again.

Nov 01 '24 20:11 erlend-aasland

Can you post updated benchmarks vs. current main?

Nov 01 '24 21:11 erlend-aasland

I have made the requested changes; please review again

Nov 01 '24 22:11 sayandipdutta

Thanks for making the requested changes!

@erlend-aasland: please review the changes made to this pull request.

Nov 01 '24 22:11 bedevere-app[bot]

Checked against debug build. Followed script from https://github.com/python/cpython/pull/125917#issuecomment-2434813259

call	Main branch	This branch
reduce(f, lst)	3.59 us +- 0.13 us	3.51 us +- 0.14 us
reduce(f, lst, initial)	3.82 us +- 0.25 us	3.78 us +- 0.13 us

@erlend-aasland

EDIT: On release:

call	Main branch	This branch
reduce(f, lst)	912 ns +- 51 ns	915 ns +- 50 ns
reduce(f, lst, initial)	987 ns +- 49 ns	979 ns +- 34 ns

Nov 01 '24 22:11 sayandipdutta

BTW, you need a What's New entry for this.

Nov 01 '24 23:11 erlend-aasland

@sayandipdutta, this now has merge conflicts.

Nov 11 '24 02:11 skirpichev

I think that most people are ok with adding only one positional-or-keyword parameter (not 3). Is there something to do, besides simple conflict resolution?

@Eclips4, are you ok with this (as your pr depends on the current one)?

Nov 11 '24 02:11 skirpichev

cpython cpython copied to clipboard

gh-125916: Allow functools.reduces 'initial' to be a keyword argument

Before:

After:

cpython
cpython copied to clipboard