volley
Benchmark Go server using Go tip
The Go server is seeing relatively poor performance scaling compared to the Rust and C servers. Before reporting this as an upstream bug, we should investigate how the Go server performs when using the tip version of Go.
Alas, it seems as though the problem still arises on Go tip.
Posted to golang-nuts.
@jbardin points out in this reply on golang-nuts that the performance drop is probably caused by the overhead introduced by doing (e)polling instead of blocking socket reads. Continuing the discussion in #4 and #5.
jon, you could run the test again with go tip. it has better goroutine performance (http://talks.golang.org/2015/state-of-go-may.slide#8), and now that we know the problem was in the async-io we can expect even better performance for golang.
i have my doubts about whether golang can achieve rust/c performance.
Has go tip changed significantly in the past two days?
Ah, you mean test go-blocking with go tip? Sure, I'll do that now.
yep :]
Done. See https://raw.githubusercontent.com/jonhoo/volley/1d9555441a2d5fa44a712a777fd95dae1503247a/benchmark/perf.png
Performance for go-blocking improves drastically for Go tip, almost to the point where it's as fast as the C and Rust implementations! Cool.
@jonhoo This is great, thank you for doing these benchmarks. Nice to see that Go tip is catching up. :+1:
It would be nice to see latency variance as well.
@xekoukou pushed to https://github.com/jonhoo/volley/blob/master/benchmark/plot.dat
@jonhoo i got a bit surprised by the latency of golang in the plot.dat file, it is very fast now, but the latency, omg..
but i think i know what is the problem, one thing went unnoticed, rust and c are creating one thread per connection, golang is creating one thread per cpu core.
looking into the plot.dat file, the only entry of go-blocking-tip that has low latency is the one with the same number of connections and cpu cores (threads):
go-blocking-tip 40 40 39us 5.89us 1000000
rust 40 40 41us 6.68us 1000000
c-threaded 40 40 40us 7.91us 1000000
i don't know if there is any way to configure golang to create one real thread per goroutine.
Well, I could increase GOMAXPROCS, but that comes with its own set of problems unfortunately. It also shouldn't really matter; to quote the Go runtime docs:
There is no limit to the number of threads that can be blocked in system calls on behalf of Go code; those do not count against the GOMAXPROCS limit.
An interesting benchmark to see would be a C implementation using a pool of workers instead of spawning a new thread for each request. That should give us more of an apples-to-apples comparison.
An interesting benchmark to see would be a C implementation using a pool of workers instead of spawning a new thread for each request. That should give us more of an apples-to-apples comparison.
Yes, it's better to do this.
Well, I could increase GOMAXPROCS, but that comes with its own set of problems unfortunately. It also shouldn't really matter; to quote the Go runtime docs:
There is no limit to the number of threads that can be blocked in system calls on behalf of Go code; those do not count against the GOMAXPROCS limit.
The GOMAXPROCS variable limits the number of operating system threads that can execute user-level Go code simultaneously. There is no limit to the number of threads that can be blocked in system calls on behalf of Go code; those do not count against the GOMAXPROCS limit. This package's GOMAXPROCS function queries and changes the limit.
Setting GOMAXPROCS equal to the number of cpus only makes sense when the requests use golang's nonblocking features, so that when anything blocks, the thread gets a new goroutine to execute. But in 'go-blocking' we are blocking the thread, so the number of threads matters in this case. The go-blocking app should accept an extra argument with the number of connections; only then is the test going to be fair.
Well, at least this is what i think; i haven't tested it, so i can't confirm.
I would make a pr, but for some reason i can't compile the c code to run the tests :[
It's not entirely clear how to interpret that statement from the docs. While it is true that we're blocking a user-level goroutine, we are also blocking on a system call, so it might be that Go is smart enough to then allow another goroutine to run. I'm not sure about this though.
Can you open another ticket with the C compilation error you're getting?
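One way to probe the docs statement empirically is sketched below. This is only a sketch, with assumptions: it targets Unix-like systems, it uses raw syscall.Pipe / syscall.Read so the reads really block in the kernel (this may not match how go-blocking does its socket reads), and the function name threadsWhileBlocked is made up for illustration. With GOMAXPROCS pinned to 1, it parks n goroutines in a blocking read and then asks the runtime how many OS threads it has created so far, using the "threadcreate" profile from runtime/pprof.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
	"syscall"
	"time"
)

// threadsWhileBlocked (hypothetical name) blocks n goroutines in a raw
// read(2) while GOMAXPROCS is 1, and reports how many OS threads the
// runtime has created by that point.
func threadsWhileBlocked(n int) (int, error) {
	// Set the limit to 1 now; the deferred call restores the old value.
	defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(1))

	pipes := make([][2]int, n)
	for i := range pipes {
		if err := syscall.Pipe(pipes[i][:]); err != nil {
			return 0, err
		}
	}

	done := make(chan struct{}, n)
	for _, p := range pipes {
		go func(fd int) {
			buf := make([]byte, 1)
			syscall.Read(fd, buf) // blocks in the kernel until a byte arrives
			syscall.Close(fd)
			done <- struct{}{}
		}(p[0])
	}

	// Give the scheduler time to park every reader in its system call.
	time.Sleep(200 * time.Millisecond)
	threads := pprof.Lookup("threadcreate").Count()

	// Unblock the readers and wait for them to finish.
	for _, p := range pipes {
		syscall.Write(p[1], []byte{1})
		syscall.Close(p[1])
	}
	for i := 0; i < n; i++ {
		<-done
	}
	return threads, nil
}

func main() {
	threads, err := threadsWhileBlocked(8)
	if err != nil {
		fmt.Println("pipe failed:", err)
		return
	}
	fmt.Println("OS threads created with GOMAXPROCS=1:", threads)
}
```

If the reported count is well above 1, the blocked readers were each given their own thread, as the docs describe. That would suggest GOMAXPROCS by itself is not what caps the number of threads sitting in blocking reads.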