[BUG] Setup function running multiple times

Open nirvdrum opened this issue 2 years ago • 4 comments

Describe the bug

According to the user guide, global setup and teardown functions should be invoked once per benchmark, unless using multiple threads:

The setup/teardown callbacks will be invoked once for each benchmark. If the benchmark is multi-threaded (will run in k threads), they will be invoked exactly once before each run with k threads.

However, I'm seeing the setup and teardown functions being invoked multiple times while the framework determines its iteration count.

System

Which OS, compiler, and compiler version are you using:

  • OS: Ubuntu 22.04
  • Compiler and version: clang++ 14.0.0
  • Benchmark version: 8d86026c67e41b1f74e67c1b20cc8f73871bc76e

To reproduce

Steps to reproduce the behavior:

  1. Follow the README to build benchmark
  2. Add a single benchmark using the BENCHMARK macro, registering a setup function with Setup.
  3. Add a print statement to your setup function
  4. Run the benchmark single-threaded and see how often the setup function is called

Expected behavior

I'd expect the setup function to be called once for each benchmark, not once for every run the framework makes while determining its iteration count.
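One way to make step 3 of the reproduction measurable, rather than eyeballing print output, is to count invocations. The sketch below uses only the standard library; `RecordSetupCall`, `RecordTeardownCall`, `CheckBalanced`, and the two counters are hypothetical helper names and are not part of the benchmark library. The trailing comment shows how they might be wired to the `Setup`/`Teardown` registration, assuming the callbacks take a `const benchmark::State&` as described in the user guide.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>

// Hypothetical helpers (not part of the benchmark library): count how often
// the setup and teardown callbacks fire during one process run.
static std::atomic<int> g_setup_calls{0};
static std::atomic<int> g_teardown_calls{0};

// Call from the setup callback; returns the running total including this call.
int RecordSetupCall() {
    int n = ++g_setup_calls;
    std::fprintf(stderr, "setup call #%d\n", n);
    return n;
}

// Call from the teardown callback; returns the running total including this call.
int RecordTeardownCall() {
    int n = ++g_teardown_calls;
    std::fprintf(stderr, "teardown call #%d\n", n);
    return n;
}

// Debug check that every setup call so far has been matched by a teardown call.
void CheckBalanced() {
    assert(g_setup_calls.load() == g_teardown_calls.load());
}

// Assumed wiring in the benchmark source file:
//   static void DoSetup(const benchmark::State&)    { RecordSetupCall(); }
//   static void DoTeardown(const benchmark::State&) { RecordTeardownCall(); }
//   BENCHMARK(BM_foo)->Setup(DoSetup)->Teardown(DoTeardown);
```

If the final count on stderr is larger than 1 for a single-threaded benchmark, the behavior described in this report has been reproduced.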

nirvdrum avatar May 04 '22 04:05 nirvdrum

Hi! Would you be able to confirm that the repetition count for your benchmark(s) is 1?

We have test cases covering these exact use cases, and they're still passing so I'm a bit surprised by this behaviour.

Thanks!

oontvoo avatar May 19 '22 05:05 oontvoo

@oontvoo I encountered the same bug as @nirvdrum. I followed the user guide and set the number of iterations to 1.

BENCHMARK(BM_streamcluster)->Unit(benchmark::kMillisecond)->Iterations(1)->Setup(DoSetup)/*->Teardown(DoTeardown)*/;

The setup callback is executed again after the benchmark ends.
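If the setup work is expensive and only needs to happen once per process, a possible workaround is to make the callback idempotent, so that extra invocations are harmless. The sketch below does this with a function-local static, whose initialization runs exactly once (and is thread-safe since C++11). `EnsureInput`, `InitRuns`, and the 1024-element vector are hypothetical stand-ins for the real benchmark input, not anything from the benchmark library.

```cpp
#include <cassert>
#include <vector>

// Hypothetical counter: how many times the expensive initialization body ran.
static int g_init_runs = 0;

// Safe to call from a setup callback any number of times: the initializer
// lambda runs only on the first call, later calls return the same object.
const std::vector<int>& EnsureInput() {
    static const std::vector<int> input = []() -> std::vector<int> {
        ++g_init_runs;                     // stands in for the expensive work
        return std::vector<int>(1024, 1);  // placeholder data
    }();
    assert(!input.empty());  // sanity check: initialization produced data
    return input;
}

// Reports how many times the initializer actually executed.
int InitRuns() { return g_init_runs; }

// Assumed wiring in the benchmark source file:
//   static void DoSetup(const benchmark::State&) { EnsureInput(); }
//   BENCHMARK(BM_streamcluster)->Iterations(1)->Setup(DoSetup);
```

This only fits setup work that does not need to be reset between runs; a matching teardown must then leave the shared data alone.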

ominil avatar Jun 18 '22 13:06 ominil

I'm not sure that the issue is the same as @nirvdrum's, but I discovered that this behaviour happens when I register a MemoryManager in a custom main.

You can try to replicate this issue as follows:

streamcluster.cpp

#include <iostream>
#include <memory>
#include <benchmark/benchmark.h>
#include "memory_manager.h"

static void BM_streamcluster(benchmark::State& state) {
    for (auto _ : state) {
        std::cout << "begin benchmark" << std::endl;
//        streamCluster(stream_global, kmin, kmax, dim, chunksize, clustersize, outfilename );
    }
}

BENCHMARK(BM_streamcluster)->Unit(benchmark::kMillisecond)->Iterations(1);


//BENCHMARK_MAIN();
int main(int argc, char** argv)
{
    ::benchmark::RegisterMemoryManager(mm.get());
    ::benchmark::Initialize(&argc, argv);
    ::benchmark::RunSpecifiedBenchmarks();
    ::benchmark::RegisterMemoryManager(nullptr);

    return 0;
}

memory_manager.h

#ifndef INTEL_VECTORIZED_BENCHMARK_SUITE_MEMORY_MANAGER_H
#define INTEL_VECTORIZED_BENCHMARK_SUITE_MEMORY_MANAGER_H

#include <cstdint>
#include <cstdlib>
#include <memory>
#include <benchmark/benchmark.h>

class CustomMemoryManager: public benchmark::MemoryManager {
public:

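    // Zero-initialized so custom_malloc never reads indeterminate values
    // if it is called before Start().
    int64_t num_allocs = 0;
    int64_t max_bytes_used = 0;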


    void Start() BENCHMARK_OVERRIDE {
        num_allocs = 0;
        max_bytes_used = 0;
    }

    void Stop(Result* result) BENCHMARK_OVERRIDE {
        result->num_allocs = num_allocs;
        result->max_bytes_used = max_bytes_used;
    }
};

std::unique_ptr<CustomMemoryManager> mm(new CustomMemoryManager());

#ifdef MEMORY_PROFILER
void *custom_malloc(size_t size) {
    void *p = malloc(size);
//    std::cout << "Size: " << size << std::endl;
    mm.get()->num_allocs += 1;
    mm.get()->max_bytes_used += size;
    return p;
}
#define malloc(size) custom_malloc(size)
#endif

#endif //INTEL_VECTORIZED_BENCHMARK_SUITE_MEMORY_MANAGER_H

If I run the code above, the benchmark is executed twice.

If I instead remove the RegisterMemoryManager calls from the custom main:

//BENCHMARK_MAIN();
int main(int argc, char** argv)
{
    // ::benchmark::RegisterMemoryManager(mm.get());
    ::benchmark::Initialize(&argc, argv);
    ::benchmark::RunSpecifiedBenchmarks();
    // ::benchmark::RegisterMemoryManager(nullptr);

    return 0;
}

The benchmark is executed a single time.

To compile, I use gcc version 9.4.0 on Ubuntu 20.04.1:

g++  streamcluster.cpp  -Wall -pedantic -std=c++11 -O3 -isystem benchmark/include -Lbenchmark/build/src -lbenchmark -lpthread  -o bin/streamcluster_fullvec.exe

ominil avatar Jun 18 '22 13:06 ominil

This is working as intended (WAI) if it is related to memory managers: we run the benchmarks for those as a separate pass after the main benchmark run.
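The two-pass behavior can be illustrated with a toy model. This is not the library's runner code; `PassCounts` and `SimulateRuns` are hypothetical names, and the model assumes that setup is invoked once per run of the benchmark body, which is what the reports above suggest.

```cpp
#include <cassert>

// Toy model only: tallies of what a single registered benchmark would see.
struct PassCounts {
    int setup_calls;
    int benchmark_runs;
};

// timing_runs: how many times the benchmark body is run while measuring
//              (1 when Iterations(1) fixes the iteration count).
// memory_manager_registered: whether a non-null manager was passed to
//              RegisterMemoryManager before running the benchmarks.
PassCounts SimulateRuns(int timing_runs, bool memory_manager_registered) {
    assert(timing_runs >= 1);
    PassCounts counts{0, 0};

    // Main pass: assumed to invoke setup once before each timing run.
    for (int i = 0; i < timing_runs; ++i) {
        ++counts.setup_calls;
        ++counts.benchmark_runs;
    }

    // Separate memory-manager pass: one additional run after the main pass.
    if (memory_manager_registered) {
        ++counts.setup_calls;
        ++counts.benchmark_runs;
    }
    return counts;
}
```

Under this model, `SimulateRuns(1, true)` yields two benchmark runs and `SimulateRuns(1, false)` yields one, matching the difference observed between the two custom main functions above.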

dmah42 avatar Jun 20 '22 08:06 dmah42