cpython
cpython copied to clipboard
gh-125916: Allow functools.reduces 'initial' to be a keyword argument
Before:
from functools import reduce
from operator import sub
>>> reduce(sub, [1, 1, 2, 3, 5, 8], 21)
1
>>> reduce(sub, [1, 1, 2, 3, 5, 8], initial=21)
TypeError: reduce() takes no keyword arguments
After:
from functools import reduce
from operator import sub
>>> reduce(sub, [1, 1, 2, 3, 5, 8], 21)
1
>>> reduce(sub, [1, 1, 2, 3, 5, 8], initial=21)
1
Issue: gh-125916
📚 Documentation preview 📚: https://cpython-previews--125917.org.readthedocs.build/
CC @sobolevn, as you have added AC comment.
I did some benchmarks.
# a.py
import pyperf
from functools import reduce
f = lambda x, y: x + y
lst = list(range(10))
init = 123
runner = pyperf.Runner()
runner.bench_func('reduce(f, lst)', reduce, f, lst)
runner.bench_func('reduce(f, lst, init)', reduce, f, lst, init)
Run e.g. with: python a.py -q -o ref.json.
with results:
| Benchmark | ref | patch |
|---|---|---|
| reduce(f, lst) | 2.18 us | 2.42 us: 1.11x slower |
| reduce(f, lst, init) | 2.35 us | 2.64 us: 1.12x slower |
| Geometric mean | (ref) | 1.12x slower |
Now with AC (patch2).
Patch with AC (a draft).
diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c
index 802b1cf792..8faa8ad1ac 100644
--- a/Modules/_functoolsmodule.c
+++ b/Modules/_functoolsmodule.c
@@ -932,15 +932,30 @@ _functools_cmp_to_key_impl(PyObject *module, PyObject *mycmp)
/* reduce (used to be a builtin) ********************************************/
-// Not converted to argument clinic, because of `args` in-place modification.
-// AC will affect performance.
+/*[clinic input]
+_functools.reduce
+
+ function as func: object
+ iterable as seq: object
+ /
+ initial as result: object(c_default="NULL") = None
+
+Apply a function of two arguments cumulatively.
+
+Apply it to the items of a sequence or iterable, from left to right, so as to
+reduce the iterable to a single value. For example, reduce(lambda x, y: x+y,
+[1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). If initial is present, it is
+placed before the items of the iterable in the calculation, and serves as a
+default when the iterable is empty.
+[clinic start generated code]*/
+
static PyObject *
-functools_reduce(PyObject *self, PyObject *args)
+_functools_reduce_impl(PyObject *module, PyObject *func, PyObject *seq,
+ PyObject *result)
+/*[clinic end generated code: output=30d898fe1267c79d input=b7082b8b1473fdc2]*/
{
- PyObject *seq, *func, *result = NULL, *it;
+ PyObject *args, *it;
- if (!PyArg_UnpackTuple(args, "reduce", 2, 3, &func, &seq, &result))
- return NULL;
if (result != NULL)
Py_INCREF(result);
@@ -1006,16 +1021,6 @@ functools_reduce(PyObject *self, PyObject *args)
return NULL;
}
-PyDoc_STRVAR(functools_reduce_doc,
-"reduce(function, iterable[, initial], /) -> value\n\
-\n\
-Apply a function of two arguments cumulatively to the items of a sequence\n\
-or iterable, from left to right, so as to reduce the iterable to a single\n\
-value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates\n\
-((((1+2)+3)+4)+5). If initial is present, it is placed before the items\n\
-of the iterable in the calculation, and serves as a default when the\n\
-iterable is empty.");
-
/* lru_cache object **********************************************************/
/* There are four principal algorithmic differences from the pure python version:
@@ -1720,7 +1725,7 @@ PyDoc_STRVAR(_functools_doc,
"Tools that operate on functions.");
static PyMethodDef _functools_methods[] = {
- {"reduce", functools_reduce, METH_VARARGS, functools_reduce_doc},
+ _FUNCTOOLS_REDUCE_METHODDEF
_FUNCTOOLS_CMP_TO_KEY_METHODDEF
{NULL, NULL} /* sentinel */
};
You should run ./python Tools/clinic/clinic.py Modules/_functoolsmodule.c to update autogenerated code.
I did some benchmarks.
# a.py
import pyperf
from functools import reduce
f = lambda x, y: x + y
lst = list(range(10))
init = 123
runner = pyperf.Runner()
runner.bench_func('reduce(f, lst)', reduce, f, lst)
runner.bench_func('reduce(f, lst, init)', reduce, f, lst, init)
Run e.g. with: python a.py -q -o ref.json.
with results:
| Benchmark | ref | patch | patch2 |
|---|---|---|---|
| reduce(f, lst) | 2.18 us | 2.42 us: 1.11x slower | 2.11 us: 1.03x faster |
| reduce(f, lst, init) | 2.35 us | 2.64 us: 1.12x slower | 2.27 us: 1.04x faster |
| Geometric mean | (ref) | 1.12x slower | 1.03x faster |
Looks the patch with AC even slightly faster than in the main.
@skirpichev Is initial=None safe for backward compatibility? Does this mean reduce(Callable[[None, T], None], Iterable[T], None) will behave differently in 3.13 and 3.14?
@skirpichev Is
initial=Nonesafe for backward compatibility? Does this meanreduce(Callable[[None, T], None], Iterable[T], None)will behave differently in 3.13 and 3.14?
No, it doesn't. Someone can use None as initial value.
it should be done for all parameters, not just the initial
That might slowdown the patch v2.
No, it doesn't. Someone can use None as initial value.
That was a draft;) I think we could use same trick as for the Python version.
BTW, it seems the PEP 661 doesn't cover this at all.
Edit:
Updated AC patch with a sentinel value.
diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c
index 802b1cf792..00b4a5e6cc 100644
--- a/Modules/_functoolsmodule.c
+++ b/Modules/_functoolsmodule.c
@@ -932,15 +932,31 @@ _functools_cmp_to_key_impl(PyObject *module, PyObject *mycmp)
/* reduce (used to be a builtin) ********************************************/
-// Not converted to argument clinic, because of `args` in-place modification.
-// AC will affect performance.
+/*[clinic input]
+_functools.reduce
+
+ function as func: object
+ iterable as seq: object
+ /
+ initial as result: object(c_default="NULL") = _functools._initial_missing
+
+Apply a function of two arguments cumulatively to an iterable, from left to right.
+
+This efficiently reduce the iterable to a single value. If initial is present,
+it is placed before the items of the iterable in the calculation, and serves as
+a default when the iterable is empty.
+
+For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])
+calculates ((((1+2)+3)+4)+5).
+[clinic start generated code]*/
+
static PyObject *
-functools_reduce(PyObject *self, PyObject *args)
+_functools_reduce_impl(PyObject *module, PyObject *func, PyObject *seq,
+ PyObject *result)
+/*[clinic end generated code: output=30d898fe1267c79d input=40be8069bcbc1a75]*/
{
- PyObject *seq, *func, *result = NULL, *it;
+ PyObject *args, *it;
- if (!PyArg_UnpackTuple(args, "reduce", 2, 3, &func, &seq, &result))
- return NULL;
if (result != NULL)
Py_INCREF(result);
@@ -1006,16 +1022,6 @@ functools_reduce(PyObject *self, PyObject *args)
return NULL;
}
-PyDoc_STRVAR(functools_reduce_doc,
-"reduce(function, iterable[, initial], /) -> value\n\
-\n\
-Apply a function of two arguments cumulatively to the items of a sequence\n\
-or iterable, from left to right, so as to reduce the iterable to a single\n\
-value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates\n\
-((((1+2)+3)+4)+5). If initial is present, it is placed before the items\n\
-of the iterable in the calculation, and serves as a default when the\n\
-iterable is empty.");
-
/* lru_cache object **********************************************************/
/* There are four principal algorithmic differences from the pure python version:
@@ -1720,7 +1726,7 @@ PyDoc_STRVAR(_functools_doc,
"Tools that operate on functions.");
static PyMethodDef _functools_methods[] = {
- {"reduce", functools_reduce, METH_VARARGS, functools_reduce_doc},
+ _FUNCTOOLS_REDUCE_METHODDEF
_FUNCTOOLS_CMP_TO_KEY_METHODDEF
{NULL, NULL} /* sentinel */
};
@@ -1789,6 +1795,10 @@ _functools_exec(PyObject *module)
// lru_list_elem is used only in _lru_cache_wrapper.
// So we don't expose it in module namespace.
+ if (PyModule_Add(module, "_initial_missing", _PyObject_New(&PyBaseObject_Type)) < 0) {
+ return -1;
+ }
+
return 0;
}
@skirpichev should the default be specified at all? I think reduce(function, iterable, /, initial) is closer representation of internal working than reduce(function, iterable, /, initial=None). Or is there some sort of a convention?
EDIT: Ah scratch that. That makes initial required. I meant reduce(function, iterable, /[, initial])
I think reduce(function, iterable, /, initial) is closer representation
No. Current code in the main more accurately can be described as function with multiple signatures. Funny notation reduce(function, iterable[, initial], /) means it's possible to have two signature:
reduce(function, iterable, /)
reduce(function, iterable, initial, /)
The AC can't represent multiple signatures yet. The only way - using the sentinel value _initial_missing, like pure-Python version does. See updated patch above. You shouldn't use None as default value.
Do you update this test?
https://github.com/python/cpython/blob/ad6110a93ffa82cae71af6c78692de065d3871b5/Lib/test/test_inspect/test_inspect.py#L5699-L5701
Yes, tests should be adjusted. @sayandipdutta, CI must be green! (FYI: https://devguide.python.org/getting-started/pull-request-lifecycle/#making-good-prs)
I've tried to implement @Eclips4 suggestion to make all arguments positional-or-keyword, but this seems to be impossible with AC now. Looks as a bug.
I've tried to implement @Eclips4 suggestion to make all arguments positional-or-keyword, but this seems to be impossible with AC now. Looks as a bug.
What's the problem?
Yes, I just pushed the changes, i.e. made it unsupported instead of no_signature. I was trying to look for docs regarding this, is there any? Like what makes a signature unsupported or no_signature?
EDIT: I added an entry in Misc/ACKS following blurb output. But I just noticed Crediting, and seems like I shouldn't have done that preemptively? If that's the case, LMK I will remove it.
Yes, I just pushed the changes, i.e. made it unsupported instead of no_signature.
Looks ok for me.
I was trying to look for docs regarding this, is there any? Like what makes a signature unsupported or no_signature?
You can see _test_module_has_signatures() helper: https://github.com/python/cpython/blob/2705ffa575ff12a27c9ff9d8be086df3071903a4/Lib/test/test_inspect/test_inspect.py#L5542
It tests signature as unsupported (e.g. because the __text_signature__ added - can't be parsed by the inspect module).
What's the problem?
@Eclips4, see https://github.com/python/cpython/issues/125954
Note also, that for pure-Python version in the master we have this declaration:
def reduce(function, sequence, initial=_initial_missing):
while in the sphinx docs we have instead reduce(function, iterable[, initial], /). What naming we can count as a part of the API, sequence or iterable?
For what it's worth I'm not convinced we need to make all parameters pos-and-keyword. I think there's a much stronger use case for passing initial= as a keyword argument, since it's an optional argument and even people who understand what reduce does in general might not immediately recognize it.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.
Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.
For what it's worth I'm not convinced we need to make all parameters pos-and-keyword. I think there's a much stronger use case for passing
initial=as a keyword argument, since it's an optional argument and even people who understand whatreducedoes in general might not immediately recognize it.+1
Moreover, I think that this change should be broken up in two steps:
- Adapt
reduce()to Argument Clinic (including benchmarks)- Allow
initialto be a keyword argumentWe know that when adapting functions and methods to Argument Clinic, it is easy to introduce subtle bugs, so I would really like this to be a separate change.
I support the opinion that having all parameters as pos-and-keyword is not very useful, but if we only make initial pos-and keyword, then we need to decide what to do about Python implementation, which currently treats all parameters like pos-and-keyword arguments. Maybe we should start raising a warning if function and sequence arguments are passed as keyword arguments?
[...] then we need to decide what to do about Python implementation [...]
Differences between functools.py and _functoolsmodule.c is out of scope for this issue/change.
@erlend-aasland as AC changes coming from my patch, probably 1) point - is my job. I think that @sayandipdutta can keep this PR as-is for a while. I'll make a separate PR with AC-related changes.
PR, that switch to AC: https://github.com/python/cpython/pull/125999
Thanks a lot @skirpichev! It seems all I have to do is wait for your PR to be merged and then merge main into my PR. Will do so.
Meta: I'll marked this as draft until Sergey's PR has landed.
Meta: I'll marked this as draft until Sergey's PR has landed.
Sergey's Argument Clinic adaption landed just now. Please resolve conflicts, regenerate clinic, and mark the PR ready for review again.
Can you post updated benchmarks vs. current main?
I have made the requested changes; please review again
Thanks for making the requested changes!
@erlend-aasland: please review the changes made to this pull request.
Checked against debug build. Followed script from https://github.com/python/cpython/pull/125917#issuecomment-2434813259
| call | Main branch | This branch |
|---|---|---|
| reduce(f, lst) | 3.59 us +- 0.13 us | 3.51 us +- 0.14 us |
| reduce(f, lst, initial) | 3.82 us +- 0.25 us | 3.78 us +- 0.13 us |
@erlend-aasland
EDIT: On release:
| call | Main branch | This branch |
|---|---|---|
| reduce(f, lst) | 912 ns +- 51 ns | 915 ns +- 50 ns |
| reduce(f, lst, initial) | 987 ns +- 49 ns | 979 ns +- 34 ns |
BTW, you need a What's New entry for this.
@sayandipdutta, this now has merge conflicts.
I think that most people are ok with adding only one positional-or-keyword parameter (not 3). Is there something to do, besides simple conflict resolution?
@Eclips4, are you ok with this (as your pr depends on the current one)?