clpeak
Results for AMD Radeon Pro VII (connected via Thunderbolt 3 as an eGPU) on Windows 11
Hardware: AMD Radeon Pro VII (Vega 20) in a Node Titan Thunderbolt 3 eGPU box (with Intel JHL7420 controller)
Platform: AMD Accelerated Parallel Processing
  Device: gfx906
    Driver version : 3516.0 (PAL,HSAIL) (Win64)
    Compute units : 60
    Clock frequency : 1700 MHz
    Global memory bandwidth (GBPS)
      float : 796.86
      float2 : 827.75
      float4 : 822.74
      float8 : 793.45
      float16 : 660.62
    Single-precision compute (GFLOPS)
      float : 12798.81
      float2 : 12707.00
      float4 : 12880.65
      float8 : 12783.70
      float16 : 12607.74
    Half-precision compute (GFLOPS)
      half : 8636.53
      half2 : 25210.47
      half4 : 24664.82
      half8 : 23910.08
      half16 : 22193.79
    Double-precision compute (GFLOPS)
      double : 6455.92
      double2 : 6407.21
      double4 : 6417.81
      double8 : 6388.41
      double16 : 6272.83
    Integer compute (GIOPS)
      int : 4274.61
      int2 : 4189.85
      int4 : 4215.42
      int8 : 4187.77
      int16 : 4194.81
    Integer compute Fast 24bit (GIOPS)
      int : 12203.69
      int2 : 11403.85
      int4 : 11338.95
      int8 : 11021.97
      int16 : 10849.96
    Transfer bandwidth (GBPS)
      enqueueWriteBuffer : 31.31
      enqueueReadBuffer : 31.82
      enqueueWriteBuffer non-blocking : 31.15
      enqueueReadBuffer non-blocking : 31.92
      enqueueMapBuffer(for read) : 810371.12
      memcpy from mapped ptr : 31.92
      enqueueUnmap(after write) : 42949672.00
      memcpy to mapped ptr : 31.59
    Kernel launch latency : 52.06 us
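
To put the compute and memory numbers above in context, here is a rough Python sketch that compares the best measured value in each category against a theoretical peak derived from the reported 60 compute units and 1700 MHz clock. The per-CU figures are assumptions that do not appear in the clpeak output: 64 FP32 lanes per GCN compute unit, 2 FLOPs per fused multiply-add, packed FP16 at twice the FP32 rate, FP64 at half the FP32 rate, and a 1024 GB/s HBM2 memory peak. The names `peak_fp32_gflops`, `measured`, `peaks`, and `efficiency` are made up for this illustration.

```python
# Best measured value per category, copied from the clpeak output above.
measured = {
    "fp32_gflops": 12880.65,  # float4
    "fp16_gflops": 25210.47,  # half2
    "fp64_gflops": 6455.92,   # double
    "mem_gbps": 827.75,       # float2
}

COMPUTE_UNITS = 60   # reported by clpeak
CLOCK_GHZ = 1.7      # reported by clpeak (1700 MHz)
LANES_PER_CU = 64    # assumption: FP32 lanes per GCN compute unit
FLOPS_PER_FMA = 2    # one fused multiply-add counts as two FLOPs


def peak_fp32_gflops(compute_units, clock_ghz, lanes=LANES_PER_CU):
    """Theoretical FP32 peak in GFLOPS: CUs x lanes x 2 FLOPs x GHz."""
    return compute_units * lanes * FLOPS_PER_FMA * clock_ghz


fp32_peak = peak_fp32_gflops(COMPUTE_UNITS, CLOCK_GHZ)

# Assumed ratios relative to FP32, and an assumed memory peak.
peaks = {
    "fp32_gflops": fp32_peak,
    "fp16_gflops": fp32_peak * 2,  # assumption: packed FP16 at 2x rate
    "fp64_gflops": fp32_peak / 2,  # assumption: FP64 at 1/2 rate
    "mem_gbps": 1024.0,            # assumption: HBM2 peak bandwidth
}

# Fraction of the theoretical peak that clpeak actually reached.
efficiency = {name: measured[name] / peaks[name] for name in measured}

for name, fraction in efficiency.items():
    print(f"{name}: {measured[name]:.2f} of {peaks[name]:.2f} ({fraction:.1%})")
```

Under these assumptions, all three floating-point results land above 95% of the theoretical peak, while global memory bandwidth reaches roughly four fifths of it. The Thunderbolt 3 link does not enter this calculation, since these kernels run entirely on the card.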