DeepBench

Benchmarking Deep Learning operations on different hardware

Results 24 DeepBench issues

This pull request upgrades the Arm xGEMM tests with new library bindings to the Arm Performance Library, Arm Compute Library and OpenBLAS library to test and compare with the existing...

Your NVIDIA gemm benchmark appears to have a problem. gemm_bench.cu uses uint16_t, an integer type, instead of __half to represent half precision floating point numbers. As a result, rand() in...
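To make the type mismatch concrete, here is a minimal host-side sketch of how a 16-bit pattern decodes as an IEEE 754 half-precision value. The helper name `half_bits_to_float` is hypothetical and is not part of DeepBench or CUDA; it only illustrates that the integer stored in a `uint16_t` and the half-precision value carried by the same bits are generally different numbers.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from DeepBench): decode an IEEE 754 binary16
// bit pattern into a float. Layout: 1 sign bit, 5 exponent bits, 10 mantissa bits.
float half_bits_to_float(std::uint16_t bits) {
    const int sign = (bits >> 15) & 0x1;
    const int exponent = (bits >> 10) & 0x1F;
    const int mantissa = bits & 0x3FF;

    float value;
    if (exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24.
        value = std::ldexp(static_cast<float>(mantissa), -24);
    } else if (exponent == 31) {
        // All-ones exponent encodes infinity (mantissa == 0) or NaN.
        value = (mantissa == 0) ? std::numeric_limits<float>::infinity()
                                : std::numeric_limits<float>::quiet_NaN();
    } else {
        // Normal: (1 + mantissa / 1024) * 2^(exponent - 15),
        // written as (1024 + mantissa) * 2^(exponent - 25).
        value = std::ldexp(static_cast<float>(1024 + mantissa), exponent - 25);
    }
    return sign ? -value : value;
}
```

Under this decoding, the integer 1 stored in a `uint16_t` reads as the smallest half-precision subnormal (2^-24), while the half-precision value 1.0 has the bit pattern 0x3C00. Patterns whose exponent bits are all ones decode to infinity or NaN.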

Introduces "mixed" precision benchmarking for f16 * f16 to f32. Also fixes the data type used for half from uint16_t to __half, as reported by Wayne @wjb in issue 104:...

[DeepBench_NV_V100.xlsx](https://github.com/baidu-research/DeepBench/files/2544026/DeepBench_NV_V100.xlsx)

This pull request upgrades the x86 xGEMM tests with a new library binding to the OpenBLAS library to test and compare with the existing MKL library. It also upgrades how the...

Error corrected: opt/rocm/bin/hipcc ./gemm_bench.cpp -o bin/gemm_bench -I./../kernels -lrocblas -O3 -std=c++11 --amdgpu-target=gfx900 ./gemm_bench.cpp:9:10: fatal error: 'rocblas.h' file not found #include <rocblas.h>

This commit adds /opt/rocm/include to the makefile for AMD targets. The MIOpen and rocBLAS header files can be located under /opt/rocm/include directory.

I am getting 1/10 of the FLOP/s on the AMD Vega architecture compared with the figures in the results folder. Does anybody know why?
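When comparing a measured number against published results, it helps to confirm that both use the same operation count. The sketch below uses the conventional count of 2·M·N·K floating-point operations for a GEMM. Whether the figures in the results folder were derived in exactly this way is an assumption, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Conventional operation count for C = A * B with A (m x k) and B (k x n):
// one multiply and one add per inner-product term, so 2 * m * n * k.
double gemm_flops(std::int64_t m, std::int64_t n, std::int64_t k) {
    return 2.0 * static_cast<double>(m) * static_cast<double>(n) *
           static_cast<double>(k);
}

// Throughput in TFLOP/s given the wall-clock time of a single GEMM call.
double gemm_tflops_per_second(std::int64_t m, std::int64_t n, std::int64_t k,
                              double seconds) {
    assert(seconds > 0.0);  // a zero or negative time is a measurement error
    return gemm_flops(m, n, k) / seconds / 1e12;
}
```

For example, a 1000 x 1000 x 1000 GEMM that takes 2 ms corresponds to 1 TFLOP/s under this count. A tenfold gap can also come from mixing units (milliseconds versus microseconds) or from timing a different number of iterations, so those are worth checking first.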

Environment: 1. GPU cards: Tesla K80; 2. CUDA: 8.0; 3. cuDNN: 5.1; 4. OpenMPI: 1.10.2. Problems: After make there are five files in .../nvidia/bin: conv_bench, gemm_bench, nccl_mpi_all_reduce, nccl_single_all_reduce, rnn_bench. And...