How cache lines affect Go programs
https://colobu.com/2019/01/24/cacheline-affects-performance-in-go/
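Good article. In Java development we also often have to deal with cache line padding.
I ran the example four times, and the results do not match what you describe.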
[@sjs_19_176 cacheline]$ go test -gcflags "-N -l" -bench .
goos: linux
goarch: amd64
pkg: demo/cacheline
BenchmarkPad_Increase-4 5294812 232 ns/op
BenchmarkNoPad_Increase-4 10684954 112 ns/op
PASS
ok demo/cacheline 2.777s
[@sjs_19_176 cacheline]$ go test -gcflags "-N -l" -bench .
goos: linux
goarch: amd64
pkg: demo/cacheline
BenchmarkPad_Increase-4 10496142 119 ns/op
BenchmarkNoPad_Increase-4 7455829 160 ns/op
PASS
ok demo/cacheline 2.726s
[@sjs_19_176 cacheline]$ go test -gcflags "-N -l" -bench .
goos: linux
goarch: amd64
pkg: demo/cacheline
BenchmarkPad_Increase-4 10242184 201 ns/op
BenchmarkNoPad_Increase-4 11318866 109 ns/op
PASS
ok demo/cacheline 3.524s
[@sjs_19_176 cacheline]$ go test -gcflags "-N -l" -bench .
goos: linux
goarch: amd64
pkg: demo/cacheline
BenchmarkPad_Increase-4 11115954 117 ns/op
BenchmarkNoPad_Increase-4 7965044 158 ns/op
PASS
ok demo/cacheline 2.826s
[@sjs_19_176 cacheline]$
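One way to cross-check run-to-run swings like these is a test in which each goroutine updates only its own counter, so the only interaction between goroutines is the cache line itself. The following is a minimal sketch, not the article's benchmark: it uses plain goroutines and wall-clock timing instead of `testing.B`, and the names `noPad`, `pad`, `hammer`, and the value of `cacheLineSize` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// cacheLineSize is an assumed upper bound (128 covers Apple M1 and
// over-pads on CPUs with 64-byte lines).
const cacheLineSize = 128

// noPad keeps both counters adjacent, so they will usually share a cache line.
type noPad struct {
	a, b uint64
}

// pad places the counters cacheLineSize bytes apart.
type pad struct {
	a uint64
	_ [cacheLineSize - 8]byte
	b uint64
	_ [cacheLineSize - 8]byte
}

// hammer starts two goroutines; each increments only its own counter
// n times. It returns the elapsed wall-clock time.
func hammer(x, y *uint64, n int) time.Duration {
	var wg sync.WaitGroup
	start := time.Now()
	for _, p := range []*uint64{x, y} {
		wg.Add(1)
		go func(p *uint64) {
			defer wg.Done()
			for i := 0; i < n; i++ {
				atomic.AddUint64(p, 1)
			}
		}(p)
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	const n = 5000000
	var np noPad
	var pd pad
	fmt.Println("no pad:", hammer(&np.a, &np.b, n))
	fmt.Println("pad:   ", hammer(&pd.a, &pd.b, n))
}
```

Timings from a sketch like this still depend on the machine, on core placement, and on how many times it is repeated, so a single run in either direction should not be read as conclusive.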
A complete test: https://play.golang.org/p/hp_MXQWEJvB
@goith I ran the example four times, and the results do not match what you describe.
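// Pad places a, b, and c one cache-line size apart.
// cpu.CacheLinePadSize is the cache-line size constant of the imported cpu
// package (the import is not shown in this comment). Each padding array is
// CacheLinePadSize minus the 8 bytes taken by the preceding uint64.
type Pad struct {
	a   uint64
	_p1 [cpu.CacheLinePadSize - unsafe.Sizeof(uint64(1))%cpu.CacheLinePadSize]byte
	b   uint64
	_p2 [cpu.CacheLinePadSize - unsafe.Sizeof(uint64(1))%cpu.CacheLinePadSize]byte
	c   uint64
	_p3 [cpu.CacheLinePadSize - unsafe.Sizeof(uint64(1))%cpu.CacheLinePadSize]byte
}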
Computing the padding size this way is more accurate: on the Mac M1, CacheLinePadSize is 128, not the 64 bytes of traditional x86.
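To see what this padding formula produces, the field offsets can be printed with `unsafe.Offsetof`. This is a small sketch under assumptions: the local constant `cacheLineSize = 128` stands in for `cpu.CacheLinePadSize`, and `padded` and `layout` are illustrative names that do not come from the article.

```go
package main

import (
	"fmt"
	"unsafe"
)

// cacheLineSize is an assumed stand-in for cpu.CacheLinePadSize.
const cacheLineSize = 128

// padded uses the same padding expression as the Pad struct in the comment.
type padded struct {
	a uint64
	_ [cacheLineSize - unsafe.Sizeof(uint64(0))%cacheLineSize]byte
	b uint64
	_ [cacheLineSize - unsafe.Sizeof(uint64(0))%cacheLineSize]byte
	c uint64
	_ [cacheLineSize - unsafe.Sizeof(uint64(0))%cacheLineSize]byte
}

// layout returns the byte offsets of a, b, and c, plus the struct size.
func layout() (a, b, c, size uintptr) {
	return unsafe.Offsetof(padded{}.a),
		unsafe.Offsetof(padded{}.b),
		unsafe.Offsetof(padded{}.c),
		unsafe.Sizeof(padded{})
}

func main() {
	a, b, c, size := layout()
	fmt.Println(a, b, c, size) // prints: 0 128 256 384
}
```

Because the fields start exactly 128 bytes apart, no two of them can fall in the same 128-byte cache line, whatever the struct's base address is. With a 64-byte constant the offsets would be 0, 64, and 128, which is not enough separation on a CPU with 128-byte lines.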
package cpu
import "runtime"
// cacheLineSize is used to prevent false sharing of cache lines.
// We choose 128 because Apple Silicon, a.k.a. M1, has 128-byte cache line size.
// It doesn't cost much and is much more future-proof.
const cacheLineSize = 128
goos: darwin
goarch: arm64
pkg: go-praitce/go/app/benchmark/cachelinepadding
BenchmarkPad_Increase
BenchmarkPad_Increase-8 15356199 73.22 ns/op
BenchmarkNoPad_Increase
BenchmarkNoPad_Increase-8 13037071 94.33 ns/op
PASS
The M1 test results are attached above.