
Explore ways to compress event_collection messages

Open emilsoman opened this issue 9 years ago • 6 comments

Currently, the msgpack data that we send to the client is uncompressed. This means an object space dump can run to hundreds of MBs, and running a sampling profiler for 10 minutes or so can produce GBs of data! I'm exploring ways to compress the events we send.

Points to note:

  1. Compression and decompression should be as fast as possible to keep overhead minimal. I've chosen LZ4 as the algorithm for its promising benchmark results.
  2. Compression is most fruitful when there is a good chunk of data to compress, so it's better to compress event_collection messages, which aggregate a lot of messages.
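
Point 2 can be checked with a rough sketch. The code below is not rbkit's code: it uses Python's standard-library `zlib` and `json` as stand-ins for LZ4 and msgpack, and the event fields (`event_type`, `timestamp`, `payload`) and helper names are made up for illustration. It compares compressing each event on its own against compressing one aggregated collection.

```python
import json
import zlib


def make_events(count):
    """Build hypothetical profiler events; the field names are illustrative only."""
    return [
        {
            "event_type": "obj_created",
            "timestamp": 1436600000 + i,
            "payload": {"object_id": 70000000 + i * 40, "class_name": "String"},
        }
        for i in range(count)
    ]


def encode(obj):
    # json stands in for msgpack so the sketch needs only the standard library.
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


def size_per_message(events):
    """Total bytes when every event is compressed on its own."""
    return sum(len(zlib.compress(encode(event))) for event in events)


def size_as_collection(events):
    """Total bytes when all events are aggregated first, then compressed once."""
    return len(zlib.compress(encode(events)))


events = make_events(1000)
raw = len(encode(events))
individual = size_per_message(events)
batched = size_as_collection(events)
print(f"raw={raw} individual={individual} batched={batched}")
```

The absolute numbers will differ with LZ4 and real msgpack payloads; the sketch only illustrates why a compressor benefits from seeing many similar events at once.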

I've added LZ4 compression on the msgpack data for event_collection messages, applied just before the data is sent out over zmq (commit: https://github.com/code-mancers/rbkit/commit/c3dcfe72d26cb3bc9f45b00c0e6847808464cc39). The results look very promising:

Object space dumps in a smallish Rails app consistently shrink by around 77%. CPU samples shrink by a whopping 90%, and other events also see around a 70-80% size reduction.

emilsoman avatar Jul 11 '15 09:07 emilsoman

Related to this, would there be a noticeable change if the keys for events were strings now?

kgrz avatar Jul 11 '15 09:07 kgrz

Not sure about that, but can't beat numeric keys for sure.
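
One way to get a rough feel for the effect of key choice is to measure it. The sketch below is a stand-in under stated assumptions, not rbkit code: `zlib` and `json` replace LZ4 and msgpack, and the `KEY_IDS` mapping and event fields are hypothetical. Note that `json` turns the integer keys into one-character strings, which only approximates the single byte msgpack uses for small integers.

```python
import json
import zlib

# Hypothetical mapping from descriptive keys to compact numeric ones.
KEY_IDS = {"event_type": 0, "timestamp": 1, "payload": 2}


def encode(obj):
    # json stands in for msgpack; integer keys are serialized as "0", "1", "2".
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


def with_numeric_keys(event):
    """Return a copy of the event with each key replaced by its numeric id."""
    return {KEY_IDS[key]: value for key, value in event.items()}


def sizes(events):
    """Return (raw_bytes, compressed_bytes) for a list of events."""
    raw = encode(events)
    return len(raw), len(zlib.compress(raw))


string_keyed = [
    {"event_type": "gc_start", "timestamp": 1436600000 + i, "payload": i % 7}
    for i in range(1000)
]
numeric_keyed = [with_numeric_keys(event) for event in string_keyed]

for label, events in (("string keys", string_keyed), ("numeric keys", numeric_keyed)):
    raw_size, compressed_size = sizes(events)
    print(f"{label}: raw={raw_size} compressed={compressed_size}")
```

Comparing the raw and compressed columns side by side shows how much of the key overhead survives compression for a given payload.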

emilsoman avatar Jul 17 '15 13:07 emilsoman

@iffyuva @ishankhare07 once we're done with showing CPU profiling on the UI, we'll explore this a bit more and see if this can become a bottleneck in the client.

emilsoman avatar Jul 21 '15 06:07 emilsoman

@emilsoman agreed, not a priority

iffyuva avatar Jul 21 '15 06:07 iffyuva

If speed of compression is a concern here, then you can use Snappy (https://github.com/google/snappy). See also Zstandard (http://facebook.github.io/zstd/).

stereobooster avatar Sep 07 '17 19:09 stereobooster

@stereobooster we have already evaluated these and decided compression is not a priority until we have a usable profiling feature for development environments. Because my focus is on other projects, I'm not actively working on any rbkit features atm. PRs are welcome if you're interested in contributing. Thanks!

emilsoman avatar Sep 08 '17 10:09 emilsoman