cpython
cpython copied to clipboard
gh-117431: Improve performance of startswith and endswith (version 2)
Improve performance of startswith and endswith by eliminating double work in tailmatch.
Benchmark results:
single-character match: x.startswith('a'): Mean +- std dev: [main_startswith] 80.6 ns +- 0.7 ns -> [v2_startswith] 69.8 ns +- 0.8 ns: 1.15x faster
single-character fail: x.startswith('q'): Mean +- std dev: [main_startswith] 67.8 ns +- 0.6 ns -> [v2_startswith] 68.4 ns +- 0.7 ns: 1.01x slower
two-character match: x.startswith('ab'): Mean +- std dev: [main_startswith] 81.3 ns +- 2.1 ns -> [v2_startswith] 75.9 ns +- 0.6 ns: 1.07x faster
two-character fail head: x.startswith('qb'): Mean +- std dev: [main_startswith] 68.0 ns +- 1.1 ns -> [v2_startswith] 76.0 ns +- 0.9 ns: 1.12x slower
two-character fail tail: x.startswith('aq'): Mean +- std dev: [main_startswith] 73.0 ns +- 3.3 ns -> [v2_startswith] 68.5 ns +- 0.9 ns: 1.07x faster
multi-character match: x.startswith('abcdefghijkl'): Mean +- std dev: [main_startswith] 84.9 ns +- 1.1 ns -> [v2_startswith] 78.7 ns +- 2.3 ns: 1.08x faster
multi-character fail midle: x.startswith('abcdef_hijkl'): Mean +- std dev: [main_startswith] 85.3 ns +- 2.0 ns -> [v2_startswith] 78.6 ns +- 0.8 ns: 1.09x faster
Benchmark hidden because not significant (3): empty: x.startswith(''), multi-character different kind match: xu.startswith('abcdefghijkl'), multi-character different kind fail: xu.startswith('abcdef_hijkl')
Geometric mean: 1.03x faster
Benchmark script
import pyperf
runner = pyperf.Runner()
setup="""
x = 'abcdefghijklmnop'
y = 'abcdefghijklmnop_bbbbbbbbbbbbbbbbbb'
xu = x + '\u1234'
yu = y + '\u1234'
l = 'a' * 1000 + 'b'
x_startswith = x.startswith
"""
# Tested with ./python sw.py --rigorous -o main_startswith.json
if 1:
runner.timeit(name="empty: x.startswith('')", stmt="x.startswith(''); y.startswith('')", setup=setup)
runner.timeit(name="single-character match: x.startswith('a')", stmt="x.startswith('a'); y.startswith('a')", setup=setup)
runner.timeit(name="single-character fail: x.startswith('q')", stmt="x.startswith('q'); y.startswith('q')", setup=setup)
runner.timeit(name="two-character match: x.startswith('ab')", stmt="x.startswith('ab'); y.startswith('ab')", setup=setup)
runner.timeit(name="two-character fail head: x.startswith('qb')", stmt="x.startswith('qb'); y.startswith('qb')", setup=setup)
runner.timeit(name="two-character fail tail: x.startswith('aq')", stmt="x.startswith('aq'); y.startswith('aq')", setup=setup)
runner.timeit(name="multi-character match: x.startswith('abcdefghijkl')", stmt="x.startswith('abcdefghijkl'); y.startswith('abcdefghijkl')", setup=setup)
runner.timeit(name="multi-character fail midle: x.startswith('abcdef_hijkl')", stmt="x.startswith('abcdef_hijkl'); y.startswith('abcdef_hijkl')", setup=setup)
runner.timeit(name="multi-character different kind match: xu.startswith('abcdefghijkl')", stmt="xu.startswith('abcdefghijkl'); yu.startswith('abcdefghijkl')", setup=setup)
runner.timeit(name="multi-character different kind fail: xu.startswith('abcdef_hijkl')", stmt="xu.startswith('abcdefghijkl'); yu.startswith('abcdefghijkl')", setup=setup)
- By first checking the tail of the substring and then the start, we can combine a call to
PyUnicode_READandmemcmp - For the single character case we prevent the check with
PyUnicode_READfrom happening twice. - With this PR the performance of many cases improves, in particular the case where substrings match . The only case where performance is less, is for substrings that fail to match on the start of the substring (but the performance for substrings that fail on the end of the substring improves).
- Also see #117480
- Issue: gh-117431