
Different number of values for python_startup and python_startup_no_site between CPython 2.7 and PyPy 2.7

Open · pya opened this issue 7 years ago · 1 comment

The problem

I tried to compare performance results between different Python versions and implementations. While comparing CPython 3.6 and CPython 2.7 works as expected, I get an exception when comparing the results obtained with CPython 2.7.13 and PyPy 2.7.13.

Exact versions:

CPython:

 Python 2.7.13 |Continuum Analytics, Inc.| (default, Dec 20 2016, 23:05:08) 
 [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin

PyPy:

 Python 2.7.13 (c925e7381036, Jun 05 2017, 20:53:58) 
 [PyPy 5.8.0 with GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)]

How to reproduce the issue

Run from a CPython 2 environment:

 python -m performance run -o py27.json

Run from a PyPy environment:

pypy -m performance run -o pypy27.json

Compare:

pyperformance compare -O table py27.json pypy27.json 

What you expected to happen

I expected a table like this:

 +-------------------------+-----------+-------------+-----------------+-------------------------+
 | Benchmark               | py27.json | pypy27.json | Change          | Significance            |
 +=========================+===========+=============+=================+=========================+
 | 2to3                    | 767 ms    | 1.63 sec    | 2.13x slower    | Significant (t=-142.45) |
 +-------------------------+-----------+-------------+-----------------+-------------------------+
 | chaos                   | 215 ms    | 5.58 ms     | 38.62x faster   | Significant (t=204.35)  |
 +-------------------------+-----------+-------------+-----------------+-------------------------+

What actually happens

I get this exception:

 compare.py", line 212, in __init__
     raise RuntimeError("base and changed don't have "
 RuntimeError: base and changed don't have the same number of values

Note: The line number may have changed due to my debug prints.
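
Judging from the raise statement in the traceback, the failing check is presumably a plain length comparison along these lines (illustrative only; the variable names are guesses, not the actual pyperformance source):

 # Hypothetical shape of the check in BenchmarkResult.__init__:
 # the significance test presumably needs equally sized samples.
 if len(base_values) != len(changed_values):
     raise RuntimeError("base and changed don't have "
                        "the same number of values")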

Cause

The numbers of values for the benchmarks python_startup and python_startup_no_site differ: 200 for CPython versus 60 for PyPy (the same counts for both benchmarks).
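
For reference, the mismatch can be confirmed by loading both result files and counting the values per benchmark. A minimal sketch, assuming perf's Benchmark.get_nvalues() method:

 import perf

 # Count the values stored for each startup benchmark in both files.
 for filename in ('py27.json', 'pypy27.json'):
     suite = perf.BenchmarkSuite.load(filename)
     for name in ('python_startup', 'python_startup_no_site'):
         bench = suite.get_benchmark(name)
         # get_nvalues() is assumed to return the total number of
         # values across all runs
         print(filename, name, bench.get_nvalues())

With the files from this issue, that prints 200 for the CPython file and 60 for the PyPy file.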

My workaround

I just skipped python_startup and python_startup_no_site with:

        if name in ('python_startup', 'python_startup_no_site'):
            continue

in compare.compare_results:

def compare_results(options):
    base_label, changed_label = get_labels(options.baseline_filename,
                                           options.changed_filename)

    base_suite = perf.BenchmarkSuite.load(options.baseline_filename)
    changed_suite = perf.BenchmarkSuite.load(options.changed_filename)

    results = []
    common = set(base_suite.get_benchmark_names()) & set(
        changed_suite.get_benchmark_names())
    for name in sorted(common):
        print(name)  # debug print
        # workaround: skip the benchmarks with mismatched value counts
        if name in ('python_startup', 'python_startup_no_site'):
            continue
        base_bench = base_suite.get_benchmark(name)
        changed_bench = changed_suite.get_benchmark(name)
        result = BenchmarkResult(base_bench, changed_bench)
        results.append(result)

Suggested better solution

Either:

  1. Add a command line argument to explicitly exclude selected benchmarks from the comparison.
  2. Skip non-comparable benchmarks automatically and just list them at the end; make this behaviour optional via a command line switch (see the sketch below).
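
A rough sketch of option 2 (untested; it reuses the loop from compare_results above and simply catches the RuntimeError that BenchmarkResult raises):

 results = []
 skipped = []
 for name in sorted(common):
     base_bench = base_suite.get_benchmark(name)
     changed_bench = changed_suite.get_benchmark(name)
     try:
         result = BenchmarkResult(base_bench, changed_bench)
     except RuntimeError:
         # value counts differ: remember the benchmark instead of crashing
         skipped.append(name)
         continue
     results.append(result)
 if skipped:
     print('Skipped (not comparable): %s' % ', '.join(skipped))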

pya · Sep 25 '17 03:09

You should be able to use the "python3 -m perf compare_to ref.json patch.json" command to compare your two benchmark results.
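
With the result files from this issue, that would be:

 python3 -m perf compare_to py27.json pypy27.json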

In perf, a warning is emitted if perf cannot check whether the difference is significant; it doesn't crash. We should probably do the same in performance. Or maybe rewrite performance's compare on top of perf compare?

vstinner · Sep 28 '17 21:09