gozmq
gozmq copied to clipboard
Add simple zero copy support
The simplest thing I managed to come up with is to simply remember the data still being used in a map in the Go part so that it's not GC'd. In the callback it's then enough to just delete the record in the map and we are done. Unfortunately there is some synchronization needed, creating a possible bottleneck, but who knows, maybe not. And it's locking in the callback, so I don't know what scenario would really break this down. It should be fine and enough.
As an improvement it could be possible to simply pass the key as the integer, i.e. don't malloc space for the integer and pass the pointer, but simply cast the integer to pointer and pass that one...
I also don't like the global state in the Go module, I will think about it a bit more, if it is possible to get rid of that. I don't think that you can possibly manage to fill the map with ID's even if you use several contexts, but it's just ugly...
Btw tests and other stuff left out for now, we should first get this into a usable state and then take care of the rest I think.
In fact I haven't even tried to run it, just to compile it :P
Signed-off-by: Ondrej Kupka [email protected]
Zero-copy is pretty cool, nice to see you're having fun with it! I've got my hopes up :-)
Oh yeah, a lot of fun :-)
Tried the benchmarks from zmq_test.go, but cloned them and used ZeroCopySend instead of Send, for fun:
$ go test -tags zmq_3_x -bench .
PASS
BenchmarkSendReceive1Btcp 50000 49084 ns/op
BenchmarkZCSendReceive1Btcp 50000 40015 ns/op
BenchmarkSendReceive1KBtcp 50000 35361 ns/op
BenchmarkZCSendReceive1KBtcp 50000 41221 ns/op (regression :D)
BenchmarkSendReceive1MBtcp 2000 765219 ns/op
BenchmarkZCSendReceive1MBtcp 5000 648066 ns/op
BenchmarkSendReceive1Binproc 50000 48689 ns/op
BenchmarkZCSendReceive1Binproc 50000 39488 ns/op
BenchmarkSendReceive1KBinproc 50000 50928 ns/op
BenchmarkZCSendReceive1KBinproc 50000 41030 ns/op
BenchmarkSendReceive1MBinproc 2000 767550 ns/op
BenchmarkZCSendReceive1MBinproc 5000 653546 ns/op
BenchmarkSendReceive10MBinproc 100 13035866 ns/op
BenchmarkZCSendReceive10MBinproc 200 9064803 ns/op
BenchmarkSendReceive100MBinproc 10 131525988 ns/op
BenchmarkZCSendReceive100MBinproc 20 84748663 ns/op
ok gozmq 39.567s
The differences are often very small, so I am not sure how much proving that is, but for large messages even this simply implementation shows some improvements I guess. Also the TCP tests are there more or less for fun, because I think that the TCP stack must have much more overhead that what we save with ZC on small messages...
Brb, it's completely broken now, I am too eager to send commits I guess...
You're right, I got 3.2.3 mixed up with 3.3: http://api.zeromq.org/3-3:zmq-socket-monitor where this pointer issue is fixed. What do you think of sticking with the zeromq convention that the caller is responsible for providing the callback e.g. so that other resources associated with a message can be freed? It might be simpler than synchronizing access to a shared map since a near-empty callback can block GC on a slice. Nice work on the benchmarking btw.
Good idea, I will try that. The only issue I am constantly fighting with is what is possible to do in CGO. Like very often passing callbacks between Go and C is not possible, many things are not possible. But I will do my best using casting and other regular brute methods ;-)
Sorry, I am a bit out of time right now, will continue on this ASAP.
I was just thinking... If we want to copy that zmq_msg_init_data
, what type shall we choose for the hint? I guess we can either limit it, like give me and integer and I will give it back in the callback, or let the user do everything, like yeah, I will accept unsafe.Pointer, but you C.malloc the memory and you C.free it after. Not sure if it is a good practice to use unsafe.Pointer and force the users to use C in your API, but it is clearly more flexible. Do we need such flexibility or and integer as the hint would be enough?
Hmm, I would need to rebase this so that Travis passes. Is that possible with GitHub or not? I can tell you that everything passed with Go 1.1 on my computer ;-)
Ah, cool, it rebases automatically. Travis passes now :-)
I just noticed this will only compile in Go 1.1 and up. That's not a problem for me, but maybe we can separate the 1.1-specific code into another file with build tag "go1.1". At least the use of ZCFrameBuffer in zmq_test.go, but maybe the ZCFrameBuffer type itself?
Sounds reasonable to me.
Hi, I am rather busy these days, but I will try to take a look.