gohs in multicore

Open venkatsvpr opened this issue 1 year ago • 0 comments

I ran these benchmarks on a multicore setup (32 cores) with GOMAXPROCS set to 1, 4, and 32, and the results are roughly the same. Is this expected?

As mentioned in the link, performance should increase linearly with added cores.

GOMAXPROCS:1

goarch: amd64
pkg: github.com/flier/gohs/bench/go
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
BenchmarkHyperscanBlockScan/Easy0/16             1715931               670.0 ns/op        23.88 MB/s
BenchmarkHyperscanBlockScan/Easy0/32             1748773               686.7 ns/op        46.60 MB/s
BenchmarkHyperscanBlockScan/Easy0/1K             1536878               780.2 ns/op      1312.45 MB/s
BenchmarkHyperscanBlockScan/Easy0/32K             672889              1762 ns/op        18592.67 MB/s
BenchmarkHyperscanBlockScan/Easy0/1M               32414             36524 ns/op        28709.33 MB/s
BenchmarkHyperscanBlockScan/Easy0/32M                704           1745541 ns/op        19222.94 MB/s
BenchmarkHyperscanBlockScan/Easy0i/16            2034398               591.5 ns/op        27.05 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32            1956104               610.5 ns/op        52.41 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1K            1651857               725.8 ns/op      1410.95 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32K            659960              1839 ns/op        17818.02 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1M              30812             38900 ns/op        26955.82 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32M               648           1845531 ns/op        18181.45 MB/s
BenchmarkHyperscanBlockScan/Easy1/16             2009086               592.2 ns/op        27.02 MB/s
BenchmarkHyperscanBlockScan/Easy1/32             1935408               612.1 ns/op        52.28 MB/s
BenchmarkHyperscanBlockScan/Easy1/1K             1572076               766.6 ns/op      1335.74 MB/s
BenchmarkHyperscanBlockScan/Easy1/32K             394917              3078 ns/op        10645.49 MB/s
BenchmarkHyperscanBlockScan/Easy1/1M               16137             74424 ns/op        14089.14 MB/s
BenchmarkHyperscanBlockScan/Easy1/32M                435           2711804 ns/op        12373.47 MB/s
BenchmarkHyperscanBlockScan/Medium/16            2006866               593.9 ns/op        26.94 MB/s
BenchmarkHyperscanBlockScan/Medium/32            1921970               613.1 ns/op        52.19 MB/s
BenchmarkHyperscanBlockScan/Medium/1K            1637011               719.2 ns/op      1423.86 MB/s
BenchmarkHyperscanBlockScan/Medium/32K            682360              1727 ns/op        18976.42 MB/s
BenchmarkHyperscanBlockScan/Medium/1M              34454             34887 ns/op        30056.37 MB/s
BenchmarkHyperscanBlockScan/Medium/32M               667           1721256 ns/op        19494.16 MB/s
BenchmarkHyperscanBlockScan/Hard/16              1996935               596.3 ns/op        26.83 MB/s
BenchmarkHyperscanBlockScan/Hard/32              1935126               612.3 ns/op        52.26 MB/s
BenchmarkHyperscanBlockScan/Hard/1K              1682648               706.6 ns/op      1449.26 MB/s
BenchmarkHyperscanBlockScan/Hard/32K              717792              1721 ns/op        19035.08 MB/s
BenchmarkHyperscanBlockScan/Hard/1M                34600             34801 ns/op        30130.25 MB/s
BenchmarkHyperscanBlockScan/Hard/32M                 697           1733341 ns/op        19358.24 MB/s
BenchmarkHyperscanBlockScan/Hard1/16             1874395               631.0 ns/op        25.36 MB/s
BenchmarkHyperscanBlockScan/Hard1/32             1902772               623.8 ns/op        51.30 MB/s
BenchmarkHyperscanBlockScan/Hard1/1K             1542068               764.3 ns/op      1339.83 MB/s
BenchmarkHyperscanBlockScan/Hard1/32K             258709              4637 ns/op        7067.18 MB/s
BenchmarkHyperscanBlockScan/Hard1/1M                9738            135416 ns/op        7743.36 MB/s
BenchmarkHyperscanBlockScan/Hard1/32M                274           4452196 ns/op        7536.60 MB/s
PASS
ok      github.com/flier/gohs/bench/go  59.318s

GOMAXPROCS:4

goarch: amd64
pkg: github.com/flier/gohs/bench/go
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
BenchmarkHyperscanBlockScan/Easy0/16-4           1970245               607.7 ns/op        26.33 MB/s
BenchmarkHyperscanBlockScan/Easy0/32-4           1855035               643.5 ns/op        49.72 MB/s
BenchmarkHyperscanBlockScan/Easy0/1K-4           1683074               716.0 ns/op      1430.26 MB/s
BenchmarkHyperscanBlockScan/Easy0/32K-4           648412              1718 ns/op        19070.18 MB/s
BenchmarkHyperscanBlockScan/Easy0/1M-4             35168             34605 ns/op        30301.05 MB/s
BenchmarkHyperscanBlockScan/Easy0/32M-4              913           1334215 ns/op        25149.19 MB/s
BenchmarkHyperscanBlockScan/Easy0i/16-4          1962379               587.5 ns/op        27.23 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32-4          1946095               618.0 ns/op        51.78 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1K-4          1699917               714.9 ns/op      1432.33 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32K-4          654991              1799 ns/op        18218.12 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1M-4            31868             37621 ns/op        27872.22 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32M-4             838           1468084 ns/op        22855.94 MB/s
BenchmarkHyperscanBlockScan/Easy1/16-4           1961354               587.8 ns/op        27.22 MB/s
BenchmarkHyperscanBlockScan/Easy1/32-4           1940809               630.1 ns/op        50.79 MB/s
BenchmarkHyperscanBlockScan/Easy1/1K-4           1581129               739.8 ns/op      1384.22 MB/s
BenchmarkHyperscanBlockScan/Easy1/32K-4           403366              3033 ns/op        10802.96 MB/s
BenchmarkHyperscanBlockScan/Easy1/1M-4             16539             72278 ns/op        14507.62 MB/s
BenchmarkHyperscanBlockScan/Easy1/32M-4              474           2531663 ns/op        13253.91 MB/s
BenchmarkHyperscanBlockScan/Medium/16-4          1997407               600.3 ns/op        26.66 MB/s
BenchmarkHyperscanBlockScan/Medium/32-4          1938888               618.4 ns/op        51.74 MB/s
BenchmarkHyperscanBlockScan/Medium/1K-4          1650643               712.5 ns/op      1437.15 MB/s
BenchmarkHyperscanBlockScan/Medium/32K-4          704354              1702 ns/op        19252.22 MB/s
BenchmarkHyperscanBlockScan/Medium/1M-4            35781             33807 ns/op        31016.92 MB/s
BenchmarkHyperscanBlockScan/Medium/32M-4             939           1298811 ns/op        25834.74 MB/s
BenchmarkHyperscanBlockScan/Hard/16-4            2023584               580.0 ns/op        27.58 MB/s
BenchmarkHyperscanBlockScan/Hard/32-4            1868596               631.5 ns/op        50.67 MB/s
BenchmarkHyperscanBlockScan/Hard/1K-4            1669840               692.5 ns/op      1478.64 MB/s
BenchmarkHyperscanBlockScan/Hard/32K-4            698586              1719 ns/op        19064.54 MB/s
BenchmarkHyperscanBlockScan/Hard/1M-4              35702             33578 ns/op        31227.85 MB/s
BenchmarkHyperscanBlockScan/Hard/32M-4               861           1297823 ns/op        25854.39 MB/s
BenchmarkHyperscanBlockScan/Hard1/16-4           1867569               620.4 ns/op        25.79 MB/s
BenchmarkHyperscanBlockScan/Hard1/32-4           1887868               627.3 ns/op        51.01 MB/s
BenchmarkHyperscanBlockScan/Hard1/1K-4           1579549               752.1 ns/op      1361.48 MB/s
BenchmarkHyperscanBlockScan/Hard1/32K-4           261825              4515 ns/op        7257.00 MB/s
BenchmarkHyperscanBlockScan/Hard1/1M-4              9482            126858 ns/op        8265.73 MB/s
BenchmarkHyperscanBlockScan/Hard1/32M-4              292           4136386 ns/op        8112.02 MB/s
PASS
ok      github.com/flier/gohs/bench/go  58.567s

GOMAXPROCS:32

goarch: amd64
pkg: github.com/flier/gohs/bench/go
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
BenchmarkHyperscanBlockScan/Easy0/16-32                  1956862               614.6 ns/op        26.03 MB/s
BenchmarkHyperscanBlockScan/Easy0/32-32                  1858440               647.1 ns/op        49.45 MB/s
BenchmarkHyperscanBlockScan/Easy0/1K-32                  1661392               724.8 ns/op      1412.78 MB/s
BenchmarkHyperscanBlockScan/Easy0/32K-32                  648123              1725 ns/op        18999.23 MB/s
BenchmarkHyperscanBlockScan/Easy0/1M-32                    34376             34736 ns/op        30186.62 MB/s
BenchmarkHyperscanBlockScan/Easy0/32M-32                     900           1360821 ns/op        24657.50 MB/s
BenchmarkHyperscanBlockScan/Easy0i/16-32                 2070141               575.2 ns/op        27.82 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32-32                 1941291               611.2 ns/op        52.35 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1K-32                 1693740               704.5 ns/op      1453.57 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32K-32                 637003              1807 ns/op        18133.63 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1M-32                   31851             37592 ns/op        27893.45 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32M-32                    867           1559943 ns/op        21510.04 MB/s
BenchmarkHyperscanBlockScan/Easy1/16-32                  2054518               577.5 ns/op        27.71 MB/s
BenchmarkHyperscanBlockScan/Easy1/32-32                  1923642               616.4 ns/op        51.91 MB/s
BenchmarkHyperscanBlockScan/Easy1/1K-32                  1586978               741.1 ns/op      1381.67 MB/s
BenchmarkHyperscanBlockScan/Easy1/32K-32                  397279              2978 ns/op        11003.24 MB/s
BenchmarkHyperscanBlockScan/Easy1/1M-32                    16591             72500 ns/op        14463.10 MB/s
BenchmarkHyperscanBlockScan/Easy1/32M-32                     495           2421523 ns/op        13856.75 MB/s
BenchmarkHyperscanBlockScan/Medium/16-32                 2026708               593.6 ns/op        26.95 MB/s
BenchmarkHyperscanBlockScan/Medium/32-32                 1905799               614.2 ns/op        52.10 MB/s
BenchmarkHyperscanBlockScan/Medium/1K-32                 1653423               712.5 ns/op      1437.25 MB/s
BenchmarkHyperscanBlockScan/Medium/32K-32                 675596              1691 ns/op        19373.48 MB/s
BenchmarkHyperscanBlockScan/Medium/1M-32                   34756             33595 ns/op        31211.97 MB/s
BenchmarkHyperscanBlockScan/Medium/32M-32                    924           1302569 ns/op        25760.19 MB/s
BenchmarkHyperscanBlockScan/Hard/16-32                   1949880               584.0 ns/op        27.40 MB/s
BenchmarkHyperscanBlockScan/Hard/32-32                   1889216               618.6 ns/op        51.73 MB/s
BenchmarkHyperscanBlockScan/Hard/1K-32                   1655174               702.3 ns/op      1458.03 MB/s
BenchmarkHyperscanBlockScan/Hard/32K-32                   669544              1711 ns/op        19150.83 MB/s
BenchmarkHyperscanBlockScan/Hard/1M-32                     35607             33587 ns/op        31219.47 MB/s
BenchmarkHyperscanBlockScan/Hard/32M-32                      860           1366813 ns/op        24549.40 MB/s
BenchmarkHyperscanBlockScan/Hard1/16-32                  1902019               625.6 ns/op        25.57 MB/s
BenchmarkHyperscanBlockScan/Hard1/32-32                  1895744               625.7 ns/op        51.14 MB/s
BenchmarkHyperscanBlockScan/Hard1/1K-32                  1573185               755.7 ns/op      1355.00 MB/s
BenchmarkHyperscanBlockScan/Hard1/32K-32                  260739              4520 ns/op        7249.41 MB/s
BenchmarkHyperscanBlockScan/Hard1/1M-32                     9322            126155 ns/op        8311.83 MB/s
BenchmarkHyperscanBlockScan/Hard1/32M-32                     274           4130479 ns/op        8123.62 MB/s
PASS
ok      github.com/flier/gohs/bench/go  58.398s

Machine details:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  2
Core(s) per socket:  16
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               106
Model name:          Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Stepping:            6
CPU MHz:             2801.963
CPU max MHz:         2800.0000
CPU min MHz:         800.0000
BogoMIPS:            5586.87
Virtualization:      VT-x
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           48K
L1i cache:           32K
L2 cache:            1280K
L3 cache:            49152K
NUMA node0 CPU(s):   0-31
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid aperfmperf pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid arch_capabilities

venkatsvpr, Jun 16 '23 09:06