peakperf

Achieve peak performance on x86 CPUs and NVIDIA GPUs

Results 16 peakperf issues

Hello! I noticed the following during build: `./build.sh` ...
```cpp
-- The CXX compiler identification is GNU 13.2.1
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info...
```

This PR augments the CMakeLists.txt to enable detection of CUDA libraries and compiler in locations other than their default installation paths. This is especially beneficial for setups where CUDA is...

Many uarchs (e.g., Kaby Lake) support AVX in most of their CPUs but not all (e.g., Celeron), yet peakperf currently assumes that every CPU of such a uarch supports AVX.

enhancement
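One way to avoid assuming AVX from the uarch alone is to ask the running CPU directly. The sketch below is not peakperf code: `cpu_has_avx`, `select_kernel`, and the `kernel_kind` enum are hypothetical names, and the SSE fallback is an assumption about what a non-AVX path could look like. It relies on the GCC/Clang builtin `__builtin_cpu_supports`, so it only applies to those compilers on x86.

```c
#include <stdbool.h>

/* Hypothetical kernel identifiers, for illustration only. */
typedef enum { KERNEL_SSE, KERNEL_AVX } kernel_kind;

/* Hypothetical helper: query the running CPU for AVX instead of
 * inferring support from the microarchitecture name. */
static bool cpu_has_avx(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                       /* initialise CPU feature data */
    return __builtin_cpu_supports("avx") != 0;  /* non-zero if AVX is reported */
#else
    return false;                               /* unknown platform: assume no AVX */
#endif
}

/* Pick the widest kernel the CPU reports as usable (assumed fallback: SSE). */
static kernel_kind select_kernel(bool has_avx) {
    return has_avx ? KERNEL_AVX : KERNEL_SSE;
}
```

A caller would use `select_kernel(cpu_has_avx())`, so a Kaby Lake Celeron without AVX ends up on the fallback path instead of executing an unsupported instruction.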

The table in https://github.com/Dr-Noob/peakperf#62-gpu also needs to be updated with proper information.

Same as tensor cores, but with RT cores. Not sure if these RT cores will provide more performance than tensor cores, though.

enhancement

1. Detect uarch and deduce if the GPU has tensor cores or not
2. Run a GeMM (how?) using tensor cores to achieve the peak performance in half precision

enhancement

I've tried running FLOPS on Windows: First, one has to change some `int` and `long` to stdint types (`int32_t` and `int64_t`). After that, I tried running it and the performance...

enhancement
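The reason the stdint types matter for the Windows report above: `long` is 32 bits on 64-bit Windows but 64 bits on 64-bit Linux, so a counter declared as `long` can overflow on one platform and not the other. A minimal sketch, assuming a hypothetical `total_flop` helper (not a peakperf function) that multiplies an iteration count by the operations per iteration:

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: total floating-point
 * operations for a run. Fixed-width types keep the result identical
 * on Windows and Linux, regardless of how wide `long` is. */
static int64_t total_flop(int64_t iterations, int32_t flop_per_iteration) {
    /* Widen before multiplying so the product is computed in 64 bits. */
    return iterations * (int64_t)flop_per_iteration;
}
```

With a 32-bit `long`, three billion iterations would not even fit in the counter; with `int64_t` the product is well defined on both platforms.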

Run peakperf on CPU and GPU at the same time: device == DEVICE_TYPE_HYBRID
```
Nº  Time(s)  TFLOP/s (CPU + GPU)
1   2.50984  4.300 (500 + 3800)
2   2.50898  4.310 (500...
```

enhancement