pinba_engine
Support for counters
We can currently simulate counters with timers by stopping a timer immediately after it starts and ignoring the timer_value in the reports. However, this is not ideal: some extra data is computed (timing) and we lack an absolute hit_count.
An absolute hit_count for a counter may be challenging with cyclic buffers, but I guess it could be implemented by incrementing counters in the main pool for as long as they keep getting updates. When a counter is no longer updated, it will be overwritten by new data.
A separate pool could also be set up, with a size matching the number of distinct counters the system can track.
> We can currently simulate counters with timers by stopping a timer immediately after it starts and ignoring the timer_value in the reports.
Why would you need to do that?
> However, this is not ideal: some extra data is computed (timing) and we lack an absolute hit_count.
Computed? It's just "start = time(); timer_value = time() - start;", there's not much to compute.
> An absolute hit_count for a counter may be challenging with cyclic buffers
Absolute hit_count? There are no 'persistent' objects in Pinba to keep that absolute counter. And why in the world would you need that?
I found Pinba so scalable and easy to use that I started wanting to do more with it than just PHP code benchmarking. Maybe you have read this interesting article from an Etsy engineer about how they measure anything in their architecture, inspired by another article from a Flickr engineer: http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
I think the architecture of Pinba makes it a perfect replacement for those StatsD daemons, because it's more scalable for several reasons:
- there will never be more than one stat request per page
- your dual cycle buffers make it ultra scalable
My idea with counters would be, for example, to use Pinba to track some feature usage and detect aberrant behavior in those counters across releases. With absolute counters, we could even do some kind of poor man's accounting for some usage, or compute day-by-day CTR for some features.
I know nothing is persistent in Pinba, but I thought (I don't know the internal details) it would be possible to keep incrementing the hit counter as long as the same counter gets updated regularly and thus stays in the pool (a counter wouldn't work with tags, only with a single key name). In my head, such a counter would be represented by a single record in the pool, reused and updated on each flush, with no dimensions attached to it (like script name, hostname, etc.). As this may not fit the way the main pool works, maybe a different pool could be used for this new kind of structure.
Tell me if you think it's a good idea.
> My idea with counters would be, for example, to use Pinba to track some feature usage and detect aberrant behavior in those counters across releases.
That was (and is) the main idea behind Pinba, but you don't really need absolute numbers to do that - all the changes can be seen from the graphs you draw using the data from Pinba. Say something becomes slower after the last release: you'll see that the total time of a particular timer has grown. Same goes for the counters - the changes are obvious from the graphs.
Yes, I get that, and it already works on my Pinba installation. What I'm not able to do today is get from those counters the information needed for daily reports. For instance "N hits on feature F for day D, +X.X% WoW". Do you see what I mean?
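To make the requested report line concrete, here is a minimal sketch of the arithmetic behind "N hits on feature F for day D, +X.X% WoW". It assumes absolute daily hit totals are already available from somewhere; the function name `wow_report` and the sample feature name, date, and numbers are made up for illustration and are not part of Pinba.

```python
def wow_report(feature, day, hits_today, hits_week_ago):
    """Format one daily report line with the week-over-week change.

    hits_today and hits_week_ago are absolute hit totals for the same
    weekday, one week apart (hypothetical inputs, not read from Pinba).
    """
    if hits_week_ago == 0:
        # A percentage change from zero is undefined.
        change = "n/a"
    else:
        pct = (hits_today - hits_week_ago) / hits_week_ago * 100
        change = f"{pct:+.1f}%"  # always show the sign, one decimal
    return f"{hits_today} hits on feature {feature} for day {day}, {change} WoW"


# Made-up example values:
line = wow_report("search", "2011-03-01", 1150, 1000)
print(line)
```

The point of the sketch is that the percentage needs the two absolute totals; a graph of a sliding window shows the trend but not the N in the report.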
Yes, now I see it. Well, that's an area where Pinba won't help you, I'm afraid. Storing this kind of data would require ever-growing amounts of memory, and this is certainly not the way to go.
I understand, but what I propose, in order to keep memory usage fixed for this new feature, is to create a "counter" bucket of a fixed size that would hold a fixed number of counters. Each counter would only be a key/value pair. The values would be 32-bit unsigned integers that wrap around on overflow. If all slots in the bucket are in use, a new counter would overwrite the least recently updated counter in the bucket.
I know that it's not the primary purpose of Pinba, but this feature would make Pinba an even better tool :)
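The proposed bucket can be sketched as follows. This is only an illustration of the idea under stated assumptions, not Pinba code: the class name `CounterBucket`, its methods, and the use of Python's `OrderedDict` to track update order are all hypothetical choices. It shows the three properties described above: a fixed number of slots, 32-bit unsigned values that wrap around, and eviction of the least recently updated counter when the bucket is full.

```python
from collections import OrderedDict

UINT32_MODULUS = 2 ** 32  # values wrap around like a 32-bit unsigned integer


class CounterBucket:
    """Fixed-capacity key -> counter map (hypothetical, not part of Pinba)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Ordered from least recently updated to most recently updated.
        self._slots = OrderedDict()

    def increment(self, key, amount=1):
        """Add amount to the counter for key and return its new value."""
        if key in self._slots:
            value = self._slots.pop(key)
        else:
            if len(self._slots) >= self.capacity:
                # Bucket full: drop the least recently updated counter.
                self._slots.popitem(last=False)
            value = 0
        value = (value + amount) % UINT32_MODULUS
        self._slots[key] = value  # re-inserted at the most recent end
        return value

    def get(self, key):
        """Return the current value, or None if the counter is not tracked."""
        return self._slots.get(key)


# Usage with made-up counter names: "login" is evicted because it is the
# least recently updated counter when "signup" arrives.
bucket = CounterBucket(capacity=2)
for name in ("search", "login", "search", "signup"):
    bucket.increment(name)
```

Memory stays bounded by `capacity`, at the cost that a counter which stops receiving updates eventually loses its value, which is the trade-off the proposal accepts.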