rbkit
Explore ways to compress event_collection messages
Currently, the msgpack data that we send to the client is uncompressed. This means an objectspace dump can run to hundreds of MBs, and running a sampling profiler for 10 minutes or so can produce GBs of data! I'm exploring ways to compress the events we send.
Points to note:
- Compressing and uncompressing should be as fast as possible to cause minimal overhead. I've chosen LZ4 as the algorithm for its promising benchmark results.
- Compressing will be most fruitful if we have a good chunk of data to compress. It's better to compress event_collection messages, which aggregate a lot of messages.
I've added LZ4 compression on msgpack data for event_collection messages just before sending the data out over zmq in this commit: https://github.com/code-mancers/rbkit/commit/c3dcfe72d26cb3bc9f45b00c0e6847808464cc39. The results look very promising:
Object space dumps in a smallish Rails app consistently get around 77% size savings. CPU samples get a whopping 90% savings, and other events also get around 70-80% size reduction.
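To illustrate why compressing the aggregated event_collection pays off more than compressing individual events, here is a minimal sketch. It is not the code from the commit: it uses Python's standard library only, with `json` standing in for msgpack and `zlib` standing in for LZ4, and the event fields (`event_type`, `timestamp`, `payload`, `object_id`, `class`) are hypothetical. The absolute numbers will differ from LZ4 on real rbkit data; only the batching effect is the point.

```python
import json
import zlib

# Hypothetical events; real rbkit events are msgpack maps, not JSON.
events = [
    {
        "event_type": "obj_created",
        "timestamp": 1436608260000 + i,
        "payload": {"object_id": 70000000 + i * 40, "class": "String"},
    }
    for i in range(1000)
]


def encode(obj):
    """Stand-in for msgpack serialization (assumption: JSON instead of msgpack)."""
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


# Option A: compress every event on its own.
per_event_size = sum(len(zlib.compress(encode(e))) for e in events)

# Option B: aggregate first, then compress once just before sending.
raw_batch = encode({"event_type": "event_collection", "payload": events})
compressed_batch = zlib.compress(raw_batch)  # zlib as a stand-in for LZ4
batch_size = len(compressed_batch)

# The receiving side reverses the two steps: decompress, then deserialize.
decoded = json.loads(zlib.decompress(compressed_batch))

print("uncompressed batch:", len(raw_batch), "bytes")
print("compressed per event (sum):", per_event_size, "bytes")
print("compressed as one batch:", batch_size, "bytes")
```

Compressing one large buffer lets the compressor reuse the repeated keys and class names across events, which small per-event buffers cannot do.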
Related to this, would there be a noticeable change if the keys for events were strings now?
Not sure about that, but can't beat numeric keys for sure.
@iffyuva @ishankhare07 once we're done with showing CPU profiling on the UI, we'll explore this a bit more and see if this can become a bottleneck in the client.
@emilsoman agreed, not a priority
If speed of compression is a concern here, then you can use snappy (https://github.com/google/snappy). See also zstd (http://facebook.github.io/zstd/).
@stereobooster we have already evaluated these and decided compression is not a priority until we have a usable profiling feature for development environments. Because my focus is on other projects, I'm not actively working on any rbkit features atm. PRs are welcome if you're interested in contributing. Thanks!