
Using tinystat

Open savorywatt opened this issue 10 years ago • 2 comments

https://github.com/codahale/tinystat

Took a look over the tinystat lib and thought I'd start the discussion on what the command-line API might look like. The general idea is to take everything from printSummary and printResults in the client and dump it into a struct that can be serialized in different formats (JSON, gob, etc.). After a test run, the results can be compared automatically against a previous run passed in at the start, or you can save the results out and later use the 'stats' tool to create a comparison struct (the result of running tinystat). If you provide multiple runs, it compares them all against each other.

You could then take that result and graph it using a number of output formats (SVG, HTML, etc.). Turn it into more of a generation process for the raw data, with layers of views on top of that. You could output graph JPGs that could then be put into a nicely templated HTML page, or you could create SVGs, and so on.

```
flotilla-client --previous-stats testrun5.blob
flotilla-client stats -gobfiles=run1.gob,run2.gob,run3.gob -gobout=difference.blob
flotilla-client stats -htmlgraph=difference.blob
flotilla-client stats -htmlgraph=testrun5.blob
```

The code change I propose would be to first have the printSummary and printResults calls at the end of the test write to a struct that can be written to disk.

I have been looking for a generic way to compare benchmark runs easily (programmatically too) and this could be very useful in other places.

@tylertreat more to chew on

savorywatt avatar Jan 15 '15 03:01 savorywatt

Yeah, many thanks to @codahale for pinging me re: tinystat and usl because they look really interesting/applicable.

Right now, the daemons only send back to the client a few data points from the HdrHistogram (min, max, quartiles, std dev, etc.). To compare benchmarks with tinystat, we'd conceivably need to capture the complete latency distribution (this would also be useful for plotting that distribution along the lines of this). I added a way to export/import histograms, so it is possible to serialize those.

I guess the question is whether it makes sense to create the summaries on the daemon and ship those back or ship the complete data set back. The latter probably makes more sense because it allows the client to do what it wants with the raw data. I like the idea of turning it into a generation process and layering views on top.

tylertreat avatar Jan 15 '15 04:01 tylertreat

Yeah, I did some reading on USL tonight, lots of meat there. A little complicated but it looks really useful.

It probably makes sense to just start with your snapshot step and have that sent back to the client or posted somewhere else. Then the client can collect them from each daemon for further processing. Theoretically, depending on the format the snapshot is in, we could do stats based on that to start with.

I'm basing the above on my naive understanding of what you did in that PR as I'm not versed in the hdr lib's internals.

savorywatt avatar Jan 15 '15 05:01 savorywatt