gohs in multicore
I ran these benchmarks on a multicore machine (32 logical CPUs) with different values of GOMAXPROCS (1, 4, 32), and the results are roughly the same. Is this expected?
As mentioned in the link, performance should increase linearly with added cores.
GOMAXPROCS:1
goarch: amd64
pkg: github.com/flier/gohs/bench/go
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
BenchmarkHyperscanBlockScan/Easy0/16 1715931 670.0 ns/op 23.88 MB/s
BenchmarkHyperscanBlockScan/Easy0/32 1748773 686.7 ns/op 46.60 MB/s
BenchmarkHyperscanBlockScan/Easy0/1K 1536878 780.2 ns/op 1312.45 MB/s
BenchmarkHyperscanBlockScan/Easy0/32K 672889 1762 ns/op 18592.67 MB/s
BenchmarkHyperscanBlockScan/Easy0/1M 32414 36524 ns/op 28709.33 MB/s
BenchmarkHyperscanBlockScan/Easy0/32M 704 1745541 ns/op 19222.94 MB/s
BenchmarkHyperscanBlockScan/Easy0i/16 2034398 591.5 ns/op 27.05 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32 1956104 610.5 ns/op 52.41 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1K 1651857 725.8 ns/op 1410.95 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32K 659960 1839 ns/op 17818.02 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1M 30812 38900 ns/op 26955.82 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32M 648 1845531 ns/op 18181.45 MB/s
BenchmarkHyperscanBlockScan/Easy1/16 2009086 592.2 ns/op 27.02 MB/s
BenchmarkHyperscanBlockScan/Easy1/32 1935408 612.1 ns/op 52.28 MB/s
BenchmarkHyperscanBlockScan/Easy1/1K 1572076 766.6 ns/op 1335.74 MB/s
BenchmarkHyperscanBlockScan/Easy1/32K 394917 3078 ns/op 10645.49 MB/s
BenchmarkHyperscanBlockScan/Easy1/1M 16137 74424 ns/op 14089.14 MB/s
BenchmarkHyperscanBlockScan/Easy1/32M 435 2711804 ns/op 12373.47 MB/s
BenchmarkHyperscanBlockScan/Medium/16 2006866 593.9 ns/op 26.94 MB/s
BenchmarkHyperscanBlockScan/Medium/32 1921970 613.1 ns/op 52.19 MB/s
BenchmarkHyperscanBlockScan/Medium/1K 1637011 719.2 ns/op 1423.86 MB/s
BenchmarkHyperscanBlockScan/Medium/32K 682360 1727 ns/op 18976.42 MB/s
BenchmarkHyperscanBlockScan/Medium/1M 34454 34887 ns/op 30056.37 MB/s
BenchmarkHyperscanBlockScan/Medium/32M 667 1721256 ns/op 19494.16 MB/s
BenchmarkHyperscanBlockScan/Hard/16 1996935 596.3 ns/op 26.83 MB/s
BenchmarkHyperscanBlockScan/Hard/32 1935126 612.3 ns/op 52.26 MB/s
BenchmarkHyperscanBlockScan/Hard/1K 1682648 706.6 ns/op 1449.26 MB/s
BenchmarkHyperscanBlockScan/Hard/32K 717792 1721 ns/op 19035.08 MB/s
BenchmarkHyperscanBlockScan/Hard/1M 34600 34801 ns/op 30130.25 MB/s
BenchmarkHyperscanBlockScan/Hard/32M 697 1733341 ns/op 19358.24 MB/s
BenchmarkHyperscanBlockScan/Hard1/16 1874395 631.0 ns/op 25.36 MB/s
BenchmarkHyperscanBlockScan/Hard1/32 1902772 623.8 ns/op 51.30 MB/s
BenchmarkHyperscanBlockScan/Hard1/1K 1542068 764.3 ns/op 1339.83 MB/s
BenchmarkHyperscanBlockScan/Hard1/32K 258709 4637 ns/op 7067.18 MB/s
BenchmarkHyperscanBlockScan/Hard1/1M 9738 135416 ns/op 7743.36 MB/s
BenchmarkHyperscanBlockScan/Hard1/32M 274 4452196 ns/op 7536.60 MB/s
PASS
ok github.com/flier/gohs/bench/go 59.318s
GOMAXPROCS:4
goarch: amd64
pkg: github.com/flier/gohs/bench/go
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
BenchmarkHyperscanBlockScan/Easy0/16-4 1970245 607.7 ns/op 26.33 MB/s
BenchmarkHyperscanBlockScan/Easy0/32-4 1855035 643.5 ns/op 49.72 MB/s
BenchmarkHyperscanBlockScan/Easy0/1K-4 1683074 716.0 ns/op 1430.26 MB/s
BenchmarkHyperscanBlockScan/Easy0/32K-4 648412 1718 ns/op 19070.18 MB/s
BenchmarkHyperscanBlockScan/Easy0/1M-4 35168 34605 ns/op 30301.05 MB/s
BenchmarkHyperscanBlockScan/Easy0/32M-4 913 1334215 ns/op 25149.19 MB/s
BenchmarkHyperscanBlockScan/Easy0i/16-4 1962379 587.5 ns/op 27.23 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32-4 1946095 618.0 ns/op 51.78 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1K-4 1699917 714.9 ns/op 1432.33 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32K-4 654991 1799 ns/op 18218.12 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1M-4 31868 37621 ns/op 27872.22 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32M-4 838 1468084 ns/op 22855.94 MB/s
BenchmarkHyperscanBlockScan/Easy1/16-4 1961354 587.8 ns/op 27.22 MB/s
BenchmarkHyperscanBlockScan/Easy1/32-4 1940809 630.1 ns/op 50.79 MB/s
BenchmarkHyperscanBlockScan/Easy1/1K-4 1581129 739.8 ns/op 1384.22 MB/s
BenchmarkHyperscanBlockScan/Easy1/32K-4 403366 3033 ns/op 10802.96 MB/s
BenchmarkHyperscanBlockScan/Easy1/1M-4 16539 72278 ns/op 14507.62 MB/s
BenchmarkHyperscanBlockScan/Easy1/32M-4 474 2531663 ns/op 13253.91 MB/s
BenchmarkHyperscanBlockScan/Medium/16-4 1997407 600.3 ns/op 26.66 MB/s
BenchmarkHyperscanBlockScan/Medium/32-4 1938888 618.4 ns/op 51.74 MB/s
BenchmarkHyperscanBlockScan/Medium/1K-4 1650643 712.5 ns/op 1437.15 MB/s
BenchmarkHyperscanBlockScan/Medium/32K-4 704354 1702 ns/op 19252.22 MB/s
BenchmarkHyperscanBlockScan/Medium/1M-4 35781 33807 ns/op 31016.92 MB/s
BenchmarkHyperscanBlockScan/Medium/32M-4 939 1298811 ns/op 25834.74 MB/s
BenchmarkHyperscanBlockScan/Hard/16-4 2023584 580.0 ns/op 27.58 MB/s
BenchmarkHyperscanBlockScan/Hard/32-4 1868596 631.5 ns/op 50.67 MB/s
BenchmarkHyperscanBlockScan/Hard/1K-4 1669840 692.5 ns/op 1478.64 MB/s
BenchmarkHyperscanBlockScan/Hard/32K-4 698586 1719 ns/op 19064.54 MB/s
BenchmarkHyperscanBlockScan/Hard/1M-4 35702 33578 ns/op 31227.85 MB/s
BenchmarkHyperscanBlockScan/Hard/32M-4 861 1297823 ns/op 25854.39 MB/s
BenchmarkHyperscanBlockScan/Hard1/16-4 1867569 620.4 ns/op 25.79 MB/s
BenchmarkHyperscanBlockScan/Hard1/32-4 1887868 627.3 ns/op 51.01 MB/s
BenchmarkHyperscanBlockScan/Hard1/1K-4 1579549 752.1 ns/op 1361.48 MB/s
BenchmarkHyperscanBlockScan/Hard1/32K-4 261825 4515 ns/op 7257.00 MB/s
BenchmarkHyperscanBlockScan/Hard1/1M-4 9482 126858 ns/op 8265.73 MB/s
BenchmarkHyperscanBlockScan/Hard1/32M-4 292 4136386 ns/op 8112.02 MB/s
PASS
ok github.com/flier/gohs/bench/go 58.567s
GOMAXPROCS:32
goarch: amd64
pkg: github.com/flier/gohs/bench/go
cpu: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
BenchmarkHyperscanBlockScan/Easy0/16-32 1956862 614.6 ns/op 26.03 MB/s
BenchmarkHyperscanBlockScan/Easy0/32-32 1858440 647.1 ns/op 49.45 MB/s
BenchmarkHyperscanBlockScan/Easy0/1K-32 1661392 724.8 ns/op 1412.78 MB/s
BenchmarkHyperscanBlockScan/Easy0/32K-32 648123 1725 ns/op 18999.23 MB/s
BenchmarkHyperscanBlockScan/Easy0/1M-32 34376 34736 ns/op 30186.62 MB/s
BenchmarkHyperscanBlockScan/Easy0/32M-32 900 1360821 ns/op 24657.50 MB/s
BenchmarkHyperscanBlockScan/Easy0i/16-32 2070141 575.2 ns/op 27.82 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32-32 1941291 611.2 ns/op 52.35 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1K-32 1693740 704.5 ns/op 1453.57 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32K-32 637003 1807 ns/op 18133.63 MB/s
BenchmarkHyperscanBlockScan/Easy0i/1M-32 31851 37592 ns/op 27893.45 MB/s
BenchmarkHyperscanBlockScan/Easy0i/32M-32 867 1559943 ns/op 21510.04 MB/s
BenchmarkHyperscanBlockScan/Easy1/16-32 2054518 577.5 ns/op 27.71 MB/s
BenchmarkHyperscanBlockScan/Easy1/32-32 1923642 616.4 ns/op 51.91 MB/s
BenchmarkHyperscanBlockScan/Easy1/1K-32 1586978 741.1 ns/op 1381.67 MB/s
BenchmarkHyperscanBlockScan/Easy1/32K-32 397279 2978 ns/op 11003.24 MB/s
BenchmarkHyperscanBlockScan/Easy1/1M-32 16591 72500 ns/op 14463.10 MB/s
BenchmarkHyperscanBlockScan/Easy1/32M-32 495 2421523 ns/op 13856.75 MB/s
BenchmarkHyperscanBlockScan/Medium/16-32 2026708 593.6 ns/op 26.95 MB/s
BenchmarkHyperscanBlockScan/Medium/32-32 1905799 614.2 ns/op 52.10 MB/s
BenchmarkHyperscanBlockScan/Medium/1K-32 1653423 712.5 ns/op 1437.25 MB/s
BenchmarkHyperscanBlockScan/Medium/32K-32 675596 1691 ns/op 19373.48 MB/s
BenchmarkHyperscanBlockScan/Medium/1M-32 34756 33595 ns/op 31211.97 MB/s
BenchmarkHyperscanBlockScan/Medium/32M-32 924 1302569 ns/op 25760.19 MB/s
BenchmarkHyperscanBlockScan/Hard/16-32 1949880 584.0 ns/op 27.40 MB/s
BenchmarkHyperscanBlockScan/Hard/32-32 1889216 618.6 ns/op 51.73 MB/s
BenchmarkHyperscanBlockScan/Hard/1K-32 1655174 702.3 ns/op 1458.03 MB/s
BenchmarkHyperscanBlockScan/Hard/32K-32 669544 1711 ns/op 19150.83 MB/s
BenchmarkHyperscanBlockScan/Hard/1M-32 35607 33587 ns/op 31219.47 MB/s
BenchmarkHyperscanBlockScan/Hard/32M-32 860 1366813 ns/op 24549.40 MB/s
BenchmarkHyperscanBlockScan/Hard1/16-32 1902019 625.6 ns/op 25.57 MB/s
BenchmarkHyperscanBlockScan/Hard1/32-32 1895744 625.7 ns/op 51.14 MB/s
BenchmarkHyperscanBlockScan/Hard1/1K-32 1573185 755.7 ns/op 1355.00 MB/s
BenchmarkHyperscanBlockScan/Hard1/32K-32 260739 4520 ns/op 7249.41 MB/s
BenchmarkHyperscanBlockScan/Hard1/1M-32 9322 126155 ns/op 8311.83 MB/s
BenchmarkHyperscanBlockScan/Hard1/32M-32 274 4130479 ns/op 8123.62 MB/s
PASS
ok github.com/flier/gohs/bench/go 58.398s
Machine details:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 106
Model name: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Stepping: 6
CPU MHz: 2801.963
CPU max MHz: 2800.0000
CPU min MHz: 800.0000
BogoMIPS: 5586.87
Virtualization: VT-x
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 48K
L1i cache: 32K
L2 cache: 1280K
L3 cache: 49152K
NUMA node0 CPU(s): 0-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid aperfmperf pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid arch_capabilities