peakperf issues

Wrong compute architecture is being detected during build

Hello! I noticed the following during build: `./build.sh` ... ```cpp -- The CXX compiler identification is GNU 13.2.1 -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info...

wallentx

[GPU] cudaSetDevice not used when -g is specified

Dr-Noob

bug

Enhance CMakeLists.txt to Support CUDA Detection in Non-Default Paths

This PR augments the CMakeLists.txt to enable detection of CUDA libraries and compiler in locations other than their default installation paths. This is especially beneficial for setups where CUDA is...

stefano-corda

[CPU] Support for non-AVX variants

Many microarchitectures (e.g., Kaby Lake) support AVX on most of their CPUs but not all of them (e.g., Celeron models), yet peakperf currently assumes that every CPU of such a microarchitecture supports AVX.
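One possible direction is to select the kernel from runtime feature flags instead of the microarchitecture name. The sketch below is not peakperf's code: `SimdLevel`, `select_simd_level`, and `detect_simd_level` are hypothetical names, the SSE fallback is an assumption, and the detection path relies on the GCC/Clang builtin `__builtin_cpu_supports`, which is only available on x86 targets.

```cpp
// Hypothetical sketch: choose a SIMD level from CPU feature flags.
// None of these names come from peakperf itself.
enum class SimdLevel { Sse, Avx, Avx2, Avx512 };

// Pure decision logic, kept separate so it can be checked without
// depending on the machine it runs on.
inline SimdLevel select_simd_level(bool has_avx, bool has_avx2, bool has_avx512f) {
    if (!has_avx) return SimdLevel::Sse;      // e.g. a Celeron without AVX
    if (has_avx512f) return SimdLevel::Avx512;
    if (has_avx2) return SimdLevel::Avx2;
    return SimdLevel::Avx;
}

// Query the running CPU. Assumes GCC or Clang on x86; any other
// toolchain or target falls back to the most conservative level.
inline SimdLevel detect_simd_level() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return select_simd_level(__builtin_cpu_supports("avx") != 0,
                             __builtin_cpu_supports("avx2") != 0,
                             __builtin_cpu_supports("avx512f") != 0);
#else
    return SimdLevel::Sse;
#endif
}
```

With this split, the microarchitecture name could still pick the tuning parameters, while the feature flags decide which instruction set is actually safe to run.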

Dr-Noob

enhancement

[GPU] Support for Ampere GPUs

The table in https://github.com/Dr-Noob/peakperf#62-gpu also needs to be updated with proper information.

Dr-Noob

[GPU] Support for RT cores

Same as tensor cores, but with RT cores. Not sure whether RT cores will provide more performance than tensor cores, though.

Dr-Noob

enhancement

[GPU] Support for tensor cores

1. Detect uarch and deduce if the GPU has tensor cores or not
2. Run a GeMM (how?) using tensor cores to achieve the peak performance in half precision
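Step 1 could be approximated from the CUDA compute capability. The following is only a sketch under stated assumptions: `ComputeCapability` and `may_have_tensor_cores` are hypothetical names, and the values would typically come from the `major` and `minor` fields that `cudaGetDeviceProperties` fills in. The rule "compute capability 7.0 or newer" is a heuristic, not a guarantee: some Turing parts (the GTX 16 series) report 7.5 but ship without tensor cores, so a real implementation would likely need a per-chip table as well.

```cpp
// Hypothetical sketch: deduce possible tensor-core support from the
// compute capability. Names are illustrative, not peakperf's.
struct ComputeCapability {
    int cc_major;  // e.g. 7 for Volta/Turing, 8 for Ampere
    int cc_minor;
};

// Heuristic only: tensor cores were introduced with compute capability
// 7.0 (Volta). Returns true when the GPU *may* have them; GPUs such as
// the GTX 16 series (7.5) are known exceptions this check does not catch.
inline bool may_have_tensor_cores(ComputeCapability cc) {
    return cc.cc_major >= 7;
}
```

Step 2 (running a half-precision GEMM on the tensor cores) is left open here, as in the issue.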

Dr-Noob

enhancement

Running FLOPS in Windows

14

I've tried running FLOPS on Windows. First, one has to change some `int` and `long` declarations to stdint types (`int32_t` and `int64_t`). After that, I tried running it and the performance...
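The type change matters because `long` is 32 bits wide on 64-bit Windows but 64 bits wide on 64-bit Linux, so operation counters declared as `long` can overflow on one platform and not the other. A minimal sketch of the idea, with hypothetical function and parameter names that are not taken from the FLOPS or peakperf sources:

```cpp
#include <cstdint>

// Fixed-width types have the same size on every platform.
static_assert(sizeof(std::int32_t) == 4, "int32_t is always 32 bits");
static_assert(sizeof(std::int64_t) == 8, "int64_t is always 64 bits");

// Hypothetical helper: total floating-point operations of a run.
// The 64-bit first operand promotes the whole product to 64 bits, so
// the result does not wrap where a 32-bit `long` would.
inline std::int64_t total_flops(std::int64_t iterations,
                                std::int32_t ops_per_iteration,
                                std::int32_t simd_lanes,
                                std::int32_t threads) {
    return iterations * ops_per_iteration * simd_lanes * threads;
}

// Hypothetical helper: convert an operation count and a wall time
// in seconds to GFLOP/s.
inline double gflops(std::int64_t flops, double seconds) {
    return static_cast<double>(flops) / seconds / 1e9;
}
```

For example, one billion iterations of 2 operations on 8 lanes across 4 threads gives 64 billion operations, well beyond the range of a 32-bit integer.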

Dr-Noob

enhancement

Hybrid mode

Run peakperf on the CPU and GPU at the same time: device == DEVICE_TYPE_HYBRID ``` Nº Time(s) TFLOP/s (CPU + GPU) 1 2.50984 4.300 (500 + 3800) 2 2.50898 4.310 (500...

Dr-Noob

enhancement

Support for ARM?

2

Dr-Noob

enhancement
