nvbench

CUDA Kernel Benchmarking Library

Results 99 nvbench issues

We are planning to use nvbench in our Bazel monorepo. This will involve writing Bazel build rules. It would be awesome if nvbench had the build rules in the repository...

I'm fairly new to the arena of profiling CUDA kernels and would like to learn more about the basic output metrics of this library. Specifically, looking at the output of `nvbench.example.throughput`:...

type: enhancement
area: docs

The recent switch to lazy loading by default in CTK 12.2 seems to have broken the async benchmarks. This can be reproduced by `nvbench.example.axes`. The deadlock can be fixed by...

type: bug: functional

I stumbled across [this Slack thread](https://nvidia.slack.com/archives/C01Q5NC7NT0/p1658214334637459) recently when I was trying to measure small kernels with nvbench and got fluctuating results. As @senior-zero notes in the thread, the variance of...

Is there any interest in tracking the results from `nvbench`? I'm considering adding an adapter for `nvbench` to my continuous benchmarking tool, Bencher: https://github.com/bencherdev/bencher And I figured I should check...

When using this project as a dependency with `CUDAHOSTCXX=g++-11` and `CXX=clang++-17`, I'm getting the following compilation error. ``` FAILED: _deps/nvbench-build/nvbench/CMakeFiles/nvbench.dir/blocking_kernel.cu.o ccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/usr/lib/ccache/g++ -Dnvbench_EXPORTS -I/home/stephan/projects/proviz-app-samples/debug/_deps/nvbench-src -I/home/stephan/projects/proviz-app-samples/debug/_deps/nvbench-build -I/home/stephan/projects/proviz-app-samples/debug/_deps/fmt-src/include -I/home/stephan/projects/proviz-app-samples/debug/_deps/nvbench-build/nvbench/detail...

It would be great if there were a way to mark a benchmark as CUDA graph compatible and have nvbench automatically record and replay the graph as a column for...

This PR fixes a minor issue that may occur when `nvbench` is run on multiple GPUs without a user-provided CUDA stream. ## The issue The error that I observed in...

There are tags for `"nv/cold/time/gpu/stdev/relative"` and `"nv/cold/time/gpu/mean"`, but is there a way to record all iterations in the JSON output? This is an [option](https://pytest-benchmark.readthedocs.io/en/latest/usage.html) to record all data in...