rspec-benchmark Failure message appears to contradict itself


Describe the problem

When executing a spec using the power matcher I receive an error that appears to be contradictory.

Steps to reproduce the problem

Create a spec that uses the perform_power matcher.

Your code here to reproduce the issue

      it "tests complexity" do
        expect{request}.to perform_power
      end

Actual behaviour

What happened? This could be a description, log output, error raised etc.

     Failure/Error: expect{request}.to perform_power
       expected block to perform power, but performed power

Expected behaviour

What did you expect to happen? A passing test or a failing test stating the request did not perform power

Describe your environment

OS version: macOS Big Sur 11.6
Ruby version: 2.7.4
RSpec::Benchmark version: rspec-benchmark (0.6.0)

Oct 22 '21 17:10 davidimoore

Hi David,

Thank you for using rspec-benchmark and reporting this issue.

Would you be able to provide a minimal reproduction test case?

May 15 '22 20:05 piotrmurach

Ok, I've spent some time investigating the reasons behind this nonsensical error message.

When assessing whether the expectation matches, two things are taken into account:

the fit type, e.g. logarithmic
the quality of the fit, i.e. the threshold: how well the function approximates the observed trend.

The fit quality threshold is a number between 0 and 1. It defaults to 0.9, and values above that indicate a very good fit. The threshold can be changed globally or per test. For example, to lower it to 0.8 for a single test:

it "tests complexity" do
  expect { request }.to perform_power.threshold(0.8)
end

So the message "expected block to perform power, but performed power" means that the trend did match power, but the fit quality was below the 0.9 threshold.

Why is this even taken into account? A poor fit quality means that the measured values are hard to approximate with any trend line, so the reported complexity is only a best estimate. It would be better to 'improve' the test to get a more definite approximation and thus gain confidence in the measured complexity.
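To illustrate how a block can be classified as power and still fail the expectation, here is a minimal sketch that does not use rspec-benchmark's internals. It assumes the fit quality behaves like the coefficient of determination (R²) of a least-squares fit in log-log space; the helper name fit_power, the sample data and the `>=` comparison are all hypothetical.

```ruby
# Hypothetical helper, not part of rspec-benchmark: fits y = a * x**b by
# least squares on (ln x, ln y) and returns [a, b, r_squared].
def fit_power(xs, ys)
  log_xs = xs.map { |x| Math.log(x) }
  log_ys = ys.map { |y| Math.log(y) }
  mean_x = log_xs.sum / log_xs.size
  mean_y = log_ys.sum / log_ys.size

  sxy = log_xs.zip(log_ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  sxx = log_xs.sum { |x| (x - mean_x)**2 }
  syy = log_ys.sum { |y| (y - mean_y)**2 }

  exponent = sxy / sxx
  scale = Math.exp(mean_y - exponent * mean_x)
  r_squared = sxy**2 / (sxx * syy)
  [scale, exponent, r_squared]
end

THRESHOLD = 0.9 # assumed to mirror the default fit quality threshold

sizes = [8, 64, 512, 4096]
clean = sizes.map { |n| 0.5 * n**2 } # follows a power law exactly
noisy = [1.0, 50.0, 2.0, 60.0]       # made-up, very scattered timings

_, clean_exponent, clean_r2 = fit_power(sizes, clean)
_, noisy_exponent, noisy_r2 = fit_power(sizes, noisy)

# Both data sets get a power trend line, but only one clears the threshold.
puts "clean data passes: #{clean_r2 >= THRESHOLD}"
puts "noisy data passes: #{noisy_r2 >= THRESHOLD}"
```

The noisy data still gets a power trend line with a positive exponent, yet its fit quality is far below the threshold, which is the situation the confusing message describes.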

Now, this is not ideal, and can be resolved in two ways:

Expand the error message to include fit quality. For example,

expected block to perform power above 0.9 fit quality, but performed power at 0.87 fit quality

Remove the threshold from the equation and only compare the trend line.
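The first option could be sketched roughly as below. This is only an illustration of the proposed wording: failure_message_with_quality and its parameters are hypothetical and do not reflect how the matcher currently builds its message.

```ruby
# Hypothetical helper, not the gem's actual code: builds the expanded
# failure message from the expected and measured trend plus fit quality.
def failure_message_with_quality(expected, actual, threshold, fit_quality)
  format(
    "expected block to perform %s above %s fit quality, " \
    "but performed %s at %s fit quality",
    expected, threshold, actual, fit_quality.round(2)
  )
end

message = failure_message_with_quality(:power, :power, 0.9, 0.8712)
puts message
```

With both numbers in the message, a matching trend with a poor fit no longer reads as a contradiction.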

I'm reluctant to go the route of removing the threshold because such tests may become very brittle. With low threshold values, the trend line can be hard to estimate and may change with each test run. We want high confidence. Hence, I'm more inclined to improve the message and 'educate' about this parameter. Any thoughts?

Jun 05 '22 11:06 piotrmurach
