gpu-benches
Question about the L2 cache test
Hey, I was running the L2 cache test on my A800 80GB GPU and tried modifying the parameter N. I got some strange results.
With the default N=64, the result is as follows:
256 kB 768 kB 9ms 3.6% 5663.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 1024 kB 9ms 0.1% 5679.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 1280 kB 9ms 0.5% 5614.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 1536 kB 10ms 0.2% 5421.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 1792 kB 10ms 0.2% 5485.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 2048 kB 9ms 0.3% 5608.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 2304 kB 9ms 0.2% 5532.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 2560 kB 9ms 0.2% 5531.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 2816 kB 10ms 0.4% 5473.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 3072 kB 10ms 0.1% 5468.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 3328 kB 10ms 0.1% 5423.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 3584 kB 10ms 0.2% 5407.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 3840 kB 10ms 0.1% 5433.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 4096 kB 10ms 0.2% 5492.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 4352 kB 10ms 0.1% 5411.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 4608 kB 10ms 0.1% 5416.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 4864 kB 10ms 0.2% 5382.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 5120 kB 10ms 0.5% 5297.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 5632 kB 10ms 0.4% 5441.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 6144 kB 10ms 0.1% 5461.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 6656 kB 10ms 0.1% 5435.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 7168 kB 10ms 0.3% 5412.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 7680 kB 10ms 0.1% 5414.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 8448 kB 10ms 0.1% 5469.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 9216 kB 10ms 0.2% 5442.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 9984 kB 10ms 0.3% 5313.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 10752 kB 10ms 0.1% 5376.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 11776 kB 10ms 0.3% 5459.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 12800 kB 10ms 0.1% 5447.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 14080 kB 10ms 0.2% 5424.1 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 15360 kB 10ms 0.3% 5287.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 16896 kB 10ms 0.1% 5394.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 18432 kB 10ms 0.3% 5416.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 20224 kB 10ms 0.4% 5307.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 22016 kB 10ms 0.1% 5416.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 24064 kB 10ms 0.1% 5374.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 26368 kB 10ms 0.2% 5354.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 28928 kB 10ms 0.8% 5161.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 31744 kB 13ms 2.3% 4095.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 34816 kB 20ms 2.2% 2602.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 38144 kB 24ms 1.3% 2197.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 41728 kB 25ms 2.1% 2123.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 45824 kB 27ms 1.7% 1948.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 50176 kB 28ms 0.2% 1897.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 55040 kB 28ms 0.2% 1893.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 60416 kB 28ms 0.3% 1891.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 66304 kB 28ms 0.4% 1891.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 72704 kB 28ms 0.6% 1890.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 79872 kB 28ms 0.5% 1888.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 87808 kB 28ms 0.4% 1885.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 96512 kB 28ms 0.8% 1889.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 105984 kB 28ms 0.5% 1880.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 116480 kB 28ms 0.5% 1877.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 128000 kB 28ms 0.8% 1881.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 140800 kB 28ms 0.6% 1871.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 154880 kB 28ms 0.6% 1867.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 170240 kB 28ms 0.9% 1870.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 187136 kB 28ms 0.6% 1858.1 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 205824 kB 28ms 0.6% 1854.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 226304 kB 28ms 1.0% 1853.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 248832 kB 28ms 0.7% 1841.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 273664 kB 29ms 0.8% 1836.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 300800 kB 28ms 1.4% 1840.1 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 330752 kB 29ms 0.9% 1821.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 363776 kB 29ms 1.5% 1824.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 400128 kB 29ms 0.9% 1805.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 440064 kB 29ms 1.0% 1797.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 483840 kB 29ms 1.6% 1795.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 532224 kB 30ms 1.1% 1770.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 585216 kB 30ms 1.2% 1759.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 643584 kB 30ms 1.5% 1755.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 707840 kB 30ms 0.1% 1730.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 778496 kB 30ms 0.1% 1730.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 856320 kB 30ms 0.1% 1730.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 941824 kB 30ms 0.1% 1730.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 1035776 kB 30ms 0.1% 1730.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 1139200 kB 30ms 0.2% 1734.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 1253120 kB 30ms 0.1% 1730.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 1378304 kB 30ms 0.1% 1730.1 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 1516032 kB 30ms 0.2% 1733.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 1667584 kB 30ms 0.3% 1734.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 1834240 kB 30ms 0.2% 1732.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 2017536 kB 30ms 0.1% 1731.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 2219264 kB 30ms 0.1% 1730.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
256 kB 2440960 kB 30ms 0.2% 1732.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
When I change N from 64 to 512, the result is as follows:
4096 kB 12288 kB 154ms 0.1% 5447.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 16384 kB 153ms 0.1% 5471.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 20480 kB 155ms 0.2% 5418.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 24576 kB 154ms 0.4% 5435.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 28672 kB 161ms 2.2% 5207.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 32768 kB 164ms 2.1% 5117.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 36864 kB 164ms 2.1% 5103.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 40960 kB 166ms 2.3% 5051.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 45056 kB 166ms 1.5% 5045.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 49152 kB 167ms 1.0% 5012.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 53248 kB 167ms 2.6% 5023.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 57344 kB 168ms 1.9% 5000.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 61440 kB 168ms 1.9% 5003.1 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 65536 kB 167ms 1.9% 5008.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 69632 kB 167ms 2.0% 5032.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 73728 kB 168ms 2.5% 4996.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 77824 kB 171ms 1.3% 4903.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 81920 kB 200ms 4.2% 4201.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 90112 kB 199ms 1.2% 4220.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 98304 kB 210ms 51.8% 3991.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 106496 kB 213ms 58.6% 3936.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 114688 kB 425ms 3.1% 1975.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 122880 kB 439ms 0.5% 1909.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 135168 kB 440ms 0.7% 1905.1 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 147456 kB 446ms 0.3% 1879.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 159744 kB 442ms 1.0% 1896.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 172032 kB 448ms 0.4% 1870.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 188416 kB 449ms 0.2% 1867.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 204800 kB 450ms 0.1% 1864.1 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 225280 kB 452ms 0.2% 1855.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 245760 kB 451ms 0.1% 1860.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 270336 kB 451ms 0.1% 1859.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 294912 kB 452ms 0.1% 1856.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 323584 kB 451ms 0.1% 1859.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 352256 kB 451ms 0.1% 1858.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 385024 kB 452ms 0.1% 1857.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 421888 kB 453ms 0.2% 1853.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 462848 kB 453ms 0.1% 1851.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 507904 kB 455ms 0.1% 1845.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 557056 kB 452ms 0.2% 1854.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 610304 kB 453ms 0.2% 1851.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 667648 kB 453ms 0.2% 1852.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 733184 kB 454ms 0.2% 1846.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 802816 kB 453ms 0.2% 1850.6 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 880640 kB 454ms 0.1% 1847.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 966656 kB 455ms 0.2% 1842.1 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 1060864 kB 455ms 0.2% 1845.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 1163264 kB 455ms 0.2% 1842.1 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 1277952 kB 455ms 0.2% 1842.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 1404928 kB 456ms 0.2% 1839.1 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 1544192 kB 457ms 0.3% 1834.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 1695744 kB 458ms 0.3% 1830.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 1863680 kB 459ms 0.3% 1828.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 2048000 kB 459ms 0.3% 1827.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 2252800 kB 460ms 0.3% 1822.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 2478080 kB 462ms 0.3% 1817.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 2723840 kB 462ms 0.3% 1814.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 2994176 kB 464ms 0.4% 1809.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 3293184 kB 464ms 0.4% 1809.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 3620864 kB 466ms 0.4% 1798.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 3981312 kB 467ms 0.4% 1795.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 4378624 kB 469ms 0.5% 1786.9 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 4812800 kB 471ms 0.6% 1779.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 5292032 kB 473ms 0.6% 1775.3 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 5820416 kB 476ms 0.6% 1763.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 6402048 kB 477ms 0.7% 1758.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 7041024 kB 480ms 0.7% 1747.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 7741440 kB 483ms 0.8% 1736.8 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 8515584 kB 487ms 0.8% 1724.2 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 9363456 kB 489ms 0.9% 1713.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 10297344 kB 494ms 0.4% 1698.5 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 11325440 kB 495ms 0.1% 1693.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 12455936 kB 496ms 0.0% 1690.4 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 13701120 kB 481ms 3.0% 1744.7 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
4096 kB 15069184 kB 484ms 2.4% 1734.0 GB/s 0 GB/s 0 GB/s 0 GB/s 0 GB/s
From the N=64 result, the L2 cache size appears to be between 26 MB and 45 MB, but in the N=512 result the bandwidth drop falls well outside this range.
How does N affect the test result? When the value in the second column is 60416 kB (~60 MB), the bandwidth with N=64 is 1891.3 GB/s, whereas with N=512 it is about 5003.1 GB/s (taken from the nearest row, 61440 kB).
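To make the comparison concrete, the drop-off point can be read out of the two tables with a small script. This is only a rough sketch: the helper name `find_dropoff` and the 50% threshold are assumptions for illustration, not part of gpu-benches, and only a few rows from each table are copied in (second column in kB, fifth column in GB/s).

```python
def find_dropoff(rows, fraction=0.5):
    """Return (last_fast_kb, first_slow_kb) around the first big bandwidth drop.

    rows: list of (data_set_kb, bandwidth_gb_s), sorted by data set size.
    The plateau is taken to be the bandwidth of the first row (an assumption),
    and a "drop" means falling below `fraction` of that plateau.
    """
    plateau = rows[0][1]
    for (prev_kb, _), (kb, bw) in zip(rows, rows[1:]):
        if bw < fraction * plateau:
            return prev_kb, kb
    return None  # no drop found in the given rows


# Selected rows copied from the N=64 table above.
n64 = [(768, 5663.6), (28928, 5161.8), (31744, 4095.0),
       (34816, 2602.7), (38144, 2197.4)]

# Selected rows copied from the N=512 table above.
n512 = [(12288, 5447.5), (77824, 4903.3), (81920, 4201.9),
        (106496, 3936.4), (114688, 1975.3), (122880, 1909.3)]

print("N=64 :", find_dropoff(n64))
print("N=512:", find_dropoff(n512))
```

On these rows the script places the drop between 31744 kB and 34816 kB for N=64, and between 106496 kB and 114688 kB for N=512, which is the discrepancy I am asking about.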