
Add additional gc benchmark with pickletools (#437)

Open · pgdr opened this pull request 2 months ago · 7 comments

Adds a benchmark reproducing the Python 3.14 garbage collector regression described in cpython/#140175.

This real-world case uses pickletools to demonstrate the performance issue.

Fixes #437.
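For context, pickletools.optimize() returns an equivalent pickle with unused PUT opcodes removed; a minimal illustration of the operation the benchmark stresses:

```python
import pickle
import pickletools

data = pickle.dumps({"a": 1, "b": 2}, protocol=4)
smaller = pickletools.optimize(data)  # equivalent pickle, unused PUT opcodes dropped
assert pickle.loads(smaller) == {"a": 1, "b": 2}
```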

pgdr commented Nov 03 '25 11:11

| Python version | Running time (sec) |
| --- | --- |
| 3.13 | 1.59 |
| 3.14 | 6.47 |
| 3.15a | 1.55 |

These tests (and the PR) use N = 1'000'000. The downside is that running the benchmark (with Python 3.14) takes almost 10 minutes.

I could reduce the size of the instance to lower the overall running time, but it seems like the garbage collector bug doesn't "kick in" until we reach a certain size.

With N = 100'000, the slowdown is not as noticeable:

| Python version | Running time (ms) |
| --- | --- |
| 3.13 | 162 |
| 3.14 | 197 |
| 3.15a | 154 |
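For anyone who wants to check the scaling outside of pyperf, a minimal standalone timing sketch (the payload shape mirrors the benchmark; numbers will vary by machine):

```python
import pickle
import pickletools
import time

def measure(N: int) -> float:
    # Same payload shape as the benchmark: a dict of zero-padded
    # strings, pickled with protocol 4.
    payload = pickle.dumps({i: f"ii{i:>07}" for i in range(N)}, protocol=4)
    start = time.perf_counter()
    pickletools.optimize(payload)
    return time.perf_counter() - start

for n in (100_000, 1_000_000):
    print(f"N={n}: {measure(n):.2f}s")
```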

pgdr commented Nov 03 '25 13:11

@sergey-miryanov Thanks for the review. I have fixed all issues you pointed out.

pgdr commented Nov 05 '25 15:11

@sergey-miryanov Something strange happens here. Even though I use the context manager (tempfile.TemporaryDirectory), occasionally when I kill pyperformance the temporary directory is left behind.

I am not able to reproduce this behavior when running without pyperf, though, so it might be related to the way pyperf sets up its (parallel?) runners.

It sounds like a bug, but I can't tell where.
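One possible explanation, assuming pyperf stops its workers with SIGTERM (unverified): Python's default SIGTERM disposition kills the process without unwinding the stack, so TemporaryDirectory.__exit__ never runs. A sketch of a workaround; SIGKILL still cannot be intercepted:

```python
import signal
import sys
import tempfile

def _on_sigterm(signum, frame):
    # Turn SIGTERM into SystemExit so `with` blocks unwind
    # and TemporaryDirectory removes its directory.
    sys.exit(1)

signal.signal(signal.SIGTERM, _on_sigterm)

with tempfile.TemporaryDirectory() as tmpdir:
    ...  # benchmark body; cleanup now also runs on SIGTERM
```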

pgdr commented Nov 05 '25 19:11

@pgdr Thanks! It is up to pyperformance maintainers now.

sergey-miryanov commented Nov 05 '25 19:11

> These tests (and the PR) use N = 1'000'000. The downside is that running the benchmark (with Python 3.14) takes almost 10 minutes.

Taking 10 minutes would be too long. However, it only takes about 6 seconds for me to run this on Python 3.14.0 on my hardware. Perhaps the 10 minutes is for when N = 10e6? The regression I see from 3.13 to 3.14 with N = 1e6 seems large enough (roughly 1.5 seconds vs 6 seconds).

Nice work on this benchmark. I think it's good because optimize() is doing some meaningful work, unlike some other synthetic benchmarks. In addition to showing this regression in the GC, I would expect this benchmark to catch other kinds of performance regressions.

nascheme commented Nov 12 '25 19:11

Small suggestion: it would be simpler to use io.BytesIO() rather than using real files in a temporary folder. I don't think that affects the usefulness of the benchmark, since we are not really testing real file IO speed. Something like this:

```python
import pickle
import pickletools

def setup(fp, N):
    # Build a dict of zero-padded strings and pickle it into fp
    # (e.g. an io.BytesIO instance).
    x = {}
    for i in range(1, N):
        x[i] = f"ii{i:>07}"
    pickle.dump(x, fp, protocol=4)

def run(fp):
    fp.seek(0)  # rewind, otherwise read() returns nothing
    p = fp.read()
    s = pickletools.optimize(p)
```

You could use dumps() as well and do away with the file.

nascheme commented Nov 12 '25 19:11

@nascheme Thanks a lot, that saved a whole bunch of complexity. I'll run some tests and then fix it up. Something like this:

```python
import pickle
import pickletools
import pyperf


def setup(N: int) -> bytes:
    # Pickle a dict of N zero-padded strings with protocol 4.
    x = {i: f"ii{i:>07}" for i in range(N)}
    return pickle.dumps(x, protocol=4)


def run(p: bytes) -> None:
    pickletools.optimize(p)


if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.metadata["description"] = "Pickletools optimize"
    N = 100_000
    payload = setup(N)
    runner.bench_func("pickle_opt", run, payload)
```
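For reference, one way to compare interpreters with a pyperf script like this (the script and output file names are illustrative):

```sh
python3.13 bench_pickletools.py -o py313.json
python3.14 bench_pickletools.py -o py314.json
python -m pyperf compare_to py313.json py314.json
```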

pgdr commented Nov 12 '25 20:11