
Optimize copy.deepcopy by converting a generator expression into a list comprehension - 1.04x improvement in the pyperformance deepcopy benchmarks

Open heikkitoivonen opened this issue 1 week ago • 2 comments

Feature or enhancement

Proposal:

copy.deepcopy creates a generator expression that is immediately and fully consumed when it is unpacked into a call. All of the values are needed at once, so the lazy evaluation of the generator provides no benefit. Converting it to a list comprehension makes the code faster: pyperformance shows a 1.04x improvement on the deepcopy and deepcopy_reduce benchmarks.

     if deep and args:
-        args = (deepcopy(arg, memo) for arg in args)
+        args = [deepcopy(arg, memo) for arg in args]
     y = func(*args)
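
The effect of the change can be approximated outside CPython with a small timing sketch. The helpers `reconstruct_gen`, `reconstruct_list`, and `make_pair` below are hypothetical stand-ins written for illustration, not the actual code in `copy.py`, and absolute timings will vary by machine and build.

```python
import timeit
from copy import deepcopy


def reconstruct_gen(func, args, memo):
    # Original form: a generator expression, consumed immediately by the call.
    args = (deepcopy(arg, memo) for arg in args)
    return func(*args)


def reconstruct_list(func, args, memo):
    # Proposed form: a list comprehension, materialized before the call.
    args = [deepcopy(arg, memo) for arg in args]
    return func(*args)


def make_pair(a, b):
    # Hypothetical reconstruction callable that just packs its arguments.
    return (a, b)


sample_args = ([1, 2, 3], {"k": "v"})

# Both forms must produce equal results.
assert reconstruct_gen(make_pair, sample_args, {}) == reconstruct_list(
    make_pair, sample_args, {}
)

for fn in (reconstruct_gen, reconstruct_list):
    elapsed = timeit.timeit(lambda: fn(make_pair, sample_args, {}), number=20_000)
    print(f"{fn.__name__}: {elapsed:.4f} s")
```

This only isolates the argument-copying step, so it is not a substitute for the pyperformance runs below.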

pyperformance comparison:

deepcopy_baseline.json
======================

Performance version: 1.13.0
Python version: 3.15.0a3+ (64-bit) revision 4f9a8d075ee
Report on macOS-14.6.1-arm64-arm-64bit-Mach-O
Number of logical CPUs: 8
Start date: 2026-01-05 10:59:45.355502
End date: 2026-01-05 11:15:48.467921

deepcopy_optimized.json
=======================

Performance version: 1.13.0
Python version: 3.15.0a3+ (64-bit) revision 4f9a8d075ee
Report on macOS-14.6.1-arm64-arm-64bit-Mach-O
Number of logical CPUs: 8
Start date: 2026-01-05 11:26:34.289860
End date: 2026-01-05 11:42:13.727498

### deepcopy ###
Mean +- std dev: 411 us +- 2 us -> 396 us +- 3 us: 1.04x faster
Significant (t=28.94)

### deepcopy_memo ###
Mean +- std dev: 49.7 us +- 0.4 us -> 49.7 us +- 0.4 us: 1.00x faster
Not significant

### deepcopy_reduce ###
Mean +- std dev: 4.38 us +- 0.05 us -> 4.23 us +- 0.04 us: 1.04x faster
Significant (t=20.05)
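
For reference, the reported "1.04x faster" figures are the ratio of the baseline mean to the optimized mean. A minimal sketch of that arithmetic, with the means copied from the report above (the `speedup` helper is a name introduced here for illustration):

```python
# (baseline mean, optimized mean) in microseconds, copied from the report above.
results = {
    "deepcopy": (411.0, 396.0),
    "deepcopy_memo": (49.7, 49.7),
    "deepcopy_reduce": (4.38, 4.23),
}


def speedup(baseline, optimized):
    # A ratio above 1.0 means the optimized run is faster.
    return baseline / optimized


for name, (base, opt) in results.items():
    print(f"{name}: {speedup(base, opt):.3f}x")
```

Both significant results round to 1.04x, which corresponds to roughly 3.5% to 4% less time per call.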

I will make a PR for this.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Linked PRs

  • gh-143449

heikkitoivonen, Jan 05 '26 19:01

While it can be an improvement in macro benchmarks, what about small copies? Do we also see changes with the JIT vs. non-JIT builds (and what about free-threading)? cc @eendebakpt

Note: we should also estimate the memory increase from this change. Can it grow a lot? Also, could you show us some micro benchmarks with deeply nested objects (or just plain JSON objects that you would call deepcopy on)? TiA!

picnixz, Jan 05 '26 21:01

Oh nvm it's actually function arguments. I guess args is a small list?

picnixz, Jan 05 '26 22:01
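
One way to check the guess about the size of args is to look at what `__reduce_ex__(4)` returns, since that is where the arguments for the reconstruction callable come from when deepcopy falls back to the reduce protocol. The sketch below uses a hypothetical `Point` class with default pickling behavior; classes that define their own `__reduce__` or `__getnewargs__` can return longer argument tuples.

```python
import copyreg


class Point:
    # Hypothetical plain class, used only to illustrate the default behavior.
    def __init__(self, x, y):
        self.x = x
        self.y = y


# Default reduce value for protocol 4:
# (callable, args, state, listitems, dictitems)
func, args, state, *rest = Point(1, 2).__reduce_ex__(4)

print(func is copyreg.__newobj__)  # reconstruction callable
print(args)                        # arguments passed to that callable
print(state)                       # instance state, copied separately from args
```

For this class, args holds only the class itself, so the list built by the comprehension has a single element and the instance attributes travel through state instead.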