
Reduction to band miniapp prints negative flops when matrix size is equal to band size

Open msimberg opened this issue 2 years ago • 0 comments

For example:

[0]
[0] 0.00350707s -204.11GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8 GPU
[1]
[1] 0.000240678s -2974.21GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8 GPU
[2]
[2] 1.6993e-05s -42124.9GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8 GPU
[3]
[3] 1.4077e-05s -50850.9GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8 GPU
[4]
[4] 1.7594e-05s -40685.9GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8 GPU

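This happens because the flop count used by the miniapp is an approximation that keeps only the high-order term: https://github.com/eth-cscs/DLA-Future/blob/f467da9ca59d71908ddb4f9fe853d2913838c56d/miniapp/miniapp_reduction_to_band.cpp#L135. When the band size equals the matrix size, that approximation becomes negative. According to @rasolca: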

note that the correct calculation (i.e. not just the high order term) would return a NaN (as the flop are 0)

msimberg, Aug 31 '23 14:08