nvbench issues

Throughput statistics are not calculated when reads/writes are declared after `state.exec()`

The current implementation computes the throughput statistics in `measure_cold`, which is invoked during `state.exec`. This has the undesirable effect that throughput statistics are not generated when reads/writes are declared after...

alliepiper
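
To make the ordering problem concrete, here is a toy model in C++. It is not nvbench's code: `toy_state`, `add_reads`, and the two-argument `exec` are hypothetical stand-ins, and the elapsed time is passed in by hand rather than measured. The sketch only illustrates that a summary computed inside `exec` cannot see byte counts declared afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Toy stand-in for a benchmark state. NOT nvbench's real class; all names
// here are hypothetical and exist only for illustration.
struct toy_state
{
  std::size_t bytes_read = 0;
  std::optional<double> bytes_per_second; // empty if no throughput was computed

  // Declare how many bytes the kernel reads.
  void add_reads(std::size_t nbytes) { bytes_read += nbytes; }

  // Runs the kernel and computes the summary immediately, so only byte
  // counts declared before this call are visible to the computation.
  template <typename Kernel>
  void exec(Kernel &&kernel, double elapsed_seconds)
  {
    assert(elapsed_seconds > 0.0);
    kernel();
    if (bytes_read > 0)
    {
      bytes_per_second = static_cast<double>(bytes_read) / elapsed_seconds;
    }
  }
};
```

In this model, calling `add_reads` before `exec` yields a throughput value, while calling it afterwards leaves `bytes_per_second` empty, which mirrors the behaviour the issue describes.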

Restrict stopping criterion parameter usage in command line

3

The option parser now throws an exception if any parameter does not match the corresponding stopping criterion. This PR addresses issue #153.

psvvsp

devcontainer: replace `VAULT_HOST` with `AWS_ROLE_ARN`

This PR replaces the `VAULT_HOST` variable with `AWS_ROLE_ARN`, which is required to use the new token service to get AWS credentials.

jjacobelli

Add CLI parameter to fix the run count

1

nvbench is a great tool for generating profiles for libcudf. I've found that the `--profile` option with `--run-once` was a good starting point, but for many operations we need more...

GregoryKimball

Allow comparison of benchmarks for different GPUs

I was trying to compare benchmark results for A100 PCI and A100 SXM, but nvbench refused with:

```
nvbench_compare.py ./babelstream_fallback_blocks_A100_PCI/ ./babelstream_fallback_blocks_A100_SXM/
['./babelstream_fallback_blocks_A100_PCI/', './babelstream_fallback_blocks_A100_SXM/']
Device sections do not match.
```

I...

bernhardmgruber

Enable CUPTI to measure kernel execution time instead of CUDA Events

1

CUDA events suffer from low accuracy and include the kernel launch overhead. CUPTI, on the other hand, provides a more reliable way to get consistent timing measurements. This request asks...

fbusato

Create directories for output JSON files

I have run into this twice now and thought it would be great if an nvbench-based benchmark could create any intermediate directories for the output JSON file. Now, with the...

bernhardmgruber
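
One way the requested behaviour could look, sketched with the C++17 standard library only. The helper name `open_output_with_parents` is hypothetical and not part of nvbench; the sketch assumes the output location arrives as a plain file path.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not an nvbench function): create any missing parent
// directories of `output_path`, then open the file for writing.
inline std::ofstream open_output_with_parents(const fs::path &output_path)
{
  assert(output_path.has_filename()); // expects a file path, not a directory

  const fs::path parent = output_path.parent_path();
  if (!parent.empty())
  {
    std::error_code ec;
    // Creates every missing directory in the chain; an already existing
    // directory is not reported as an error.
    fs::create_directories(parent, ec);
    if (ec)
    {
      throw fs::filesystem_error("cannot create output directory", parent, ec);
    }
  }
  return std::ofstream(output_path);
}
```

With this sketch, a call such as `open_output_with_parents("results/a100/run.json")` (an illustrative path) would first create `results/a100/` if it does not exist.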

Compiler errors building examples with CUDA 11.5

I attempted to build nvbench with the following setup and was faced with compiler errors.

- nvbench commit: a171514056e5d6a7f52a035dd6c812fa301d4f4f (latest commit to main)
- nvcc: Cuda compilation tools, release 11.5,...

bryanpeele

Running examples causes segmentation fault

I followed the instructions in the README to build the examples with `cmake -DNVBench_ENABLE_EXAMPLES=ON -DCMAKE_CUDA_ARCHITECTURES=80 .. && make`, but when I run the examples with `./nvbench.example.cpp20.axes` or `./nvbench.example.cpp17.axes`, I get the error...

xiupingcui

Extend nvbench to measure SOL for compute-bound workloads

1

nvbench currently allows measuring SOL for memory-bound workloads by providing

```
state.addGlobalMemoryReads(nbytes)
state.addGlobalMemoryWrites(nbytes)
```

It would be useful to extend this concept to provide flops such that we...

samaid
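
As a rough illustration of what a compute-bound analogue might report, the sketch below derives an achieved FLOP rate and its fraction of a peak rate. Everything here is an assumption for illustration: `compute_throughput` and `compute_sol` are hypothetical names, not nvbench APIs, and the flop count, elapsed time, and peak rate are all supplied by the caller.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary type; nvbench does not define this.
struct compute_throughput
{
  double flops_per_second; // achieved rate
  double fraction_of_peak; // achieved rate divided by the peak rate
};

// Hypothetical helper: turn a declared flop count and a measured time into
// a throughput figure, in the same spirit as bytes moved divided by time
// for memory-bound workloads.
inline compute_throughput compute_sol(std::uint64_t flop_count,
                                      double elapsed_seconds,
                                      double peak_flops_per_second)
{
  assert(elapsed_seconds > 0.0 && peak_flops_per_second > 0.0);
  const double achieved = static_cast<double>(flop_count) / elapsed_seconds;
  return {achieved, achieved / peak_flops_per_second};
}
```

For example, 2e9 floating-point operations completed in 0.5 s against an assumed peak of 8e9 FLOP/s gives an achieved rate of 4e9 FLOP/s, or half of peak.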
