Add additional gc benchmark with pickletools (#437)
Adds a benchmark reproducing the Python 3.14 garbage collector regression described in cpython/#140175.
This real-world case uses pickletools to demonstrate the performance issue.
Fixes #437.
| Python version | Running time (sec) |
|---|---|
| 3.13 | 1.59 |
| 3.14 | 6.47 |
| 3.15a | 1.55 |
These tests (and the PR) have N = 1'000'000. The downside is that running the benchmark (with Python 3.14) takes almost 10 minutes.
I could reduce the size of the instance to lower the overall running time, but it seems like the garbage collector bug doesn't "kick in" until we reach a certain size.
With N = 100'000, the slowdown is not as noticeable:
| Python version | Running time (ms) |
|---|---|
| 3.13 | 162 |
| 3.14 | 197 |
| 3.15a | 154 |
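For reference, here is a minimal standalone sketch (separate from the pyperf benchmark in this PR, and only illustrative) of how one could probe where the slowdown kicks in as N grows; the payload mirrors the benchmark's dict of short strings:

```python
# Standalone timing probe (illustrative; the actual benchmark uses pyperf).
import pickle
import pickletools
import time

for n in (100_000, 250_000, 500_000, 1_000_000):
    # Same shape of payload as the benchmark: a dict mapping ints to short strings.
    payload = pickle.dumps({i: f"ii{i:>07}" for i in range(n)}, protocol=4)
    start = time.perf_counter()
    pickletools.optimize(payload)
    print(f"N={n:>9}: {time.perf_counter() - start:.2f} s")
```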
@sergey-miryanov Thanks for the review. I have fixed all issues you pointed out.
@sergey-miryanov Something strange happens here. Even though I use the context manager (tempfile.TemporaryDirectory), the directory is occasionally left behind when I kill pyperformance.
I am not able to reproduce this behavior when not running with pyperf, though, so it might be related to the way pyperf sets up (parallel?) runners.
It sounds like a bug, but I can't tell where.
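For context, the pattern in question is roughly this (a sketch, not the PR's exact code; the file name is illustrative). If the process that created the directory is killed before the `with` block exits, the cleanup in `TemporaryDirectory.__exit__` never runs, which would explain the leftover directories:

```python
import os
import pickle
import tempfile

# Sketch of the temporary-directory pattern under discussion.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.pickle")  # illustrative file name
    with open(path, "wb") as fp:
        pickle.dump({i: f"ii{i:>07}" for i in range(1_000)}, fp, protocol=4)
    # ... the benchmark would read the file back here ...
# Cleanup happens when the with block exits; a killed process never gets there.
```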
@pgdr Thanks! It is up to pyperformance maintainers now.
> These tests (and the PR) have N = 1'000'000. The downside is that running the benchmark (with Python 3.14) takes almost 10 minutes.
Taking 10 minutes would be too long. However, it only takes about 6 seconds for me to run this on Python 3.14.0 on my hardware. Perhaps the 10 minutes figure is for N = 10e6? The regression I see from 3.13 to 3.14 with N = 1e6 seems large enough (roughly 1.5 seconds vs 6 seconds).
Nice work on this benchmark. I think it's good because optimize() is doing some meaningful work, unlike some other synthetic benchmarks. In addition to showing this regression in the GC, I would expect this benchmark to catch other kinds of performance regressions.
Small suggestion: it would be simpler to use io.BytesIO() rather than using real files in a temporary folder. I don't think that affects the usefulness of the benchmark, since we are not really testing real file IO speed. Something like this:
```python
import pickle
import pickletools

def setup(fp, N):
    # Pickle a dict of N short strings into the file-like object fp.
    x = {}
    for i in range(1, N):
        x[i] = f"ii{i:>07}"
    pickle.dump(x, fp, protocol=4)

def run(fp):
    p = fp.read()
    s = pickletools.optimize(p)
```
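For completeness, a sketch (mine, not part of the review) of how the pieces above could be wired together with io.BytesIO, assuming the `setup`/`run` definitions from the snippet:

```python
import io

buf = io.BytesIO()
setup(buf, 1_000_000)  # write the pickled payload into the in-memory buffer
buf.seek(0)            # rewind so run() can read it back from the start
run(buf)
```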
You could use dumps() as well and do away with the file.
@nascheme Thanks a lot, that saved a whole bunch of complexity. I'll run some tests and then update the PR. Something like this:
```python
import pickle
import pickletools

import pyperf


def setup(N: int) -> bytes:
    # Build a dict of N short strings and return its pickled form.
    x = {i: f"ii{i:>07}" for i in range(N)}
    return pickle.dumps(x, protocol=4)


def run(p: bytes) -> None:
    pickletools.optimize(p)


if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.metadata["description"] = "Pickletools optimize"
    N = 100_000
    payload = setup(N)
    runner.bench_func("pickle_opt", run, payload)
```