naarad Aggregation Incorrect for QPS

If aggr_metrics=none is specified, the qps count is not correct. For instance, for the below example: It reports qps = 1 when it should be 4 and 1.

Also it only seems to plot second-level data points such that it only plots the last data points.

data.csv

1433409051000,30
1433409051200,10
1433409051300,30
1433409051400,16
1433409061000,30

naarad.conf

[Latency]
infile=data.csv
sep=,
columns=test
aggr_metrics=none


[GRAPH]
graphing_library=matplotlib

Jun 05 '15 21:06 richardhsu

Attached an email I sent earlier:

Hi,

After I explore a little bit, I don’t think the combine of “aggr_metrics=none" and qps reported correctly >could be solved easily. If we want to disable sub-second aggregation, naarad will generate one record >per timestamp into csv file. Then the qps generation logic reads the file and consider each entry as >separate record. In this case this qps is no longer qps any more but query per subsecond which does not >have much meanings.

Either rewrite the qps logic which will take some time or just forget about using qps when you don’t want >to use second aggregation.

-Tao

Jun 06 '15 16:06 feng-tao

CSV file is generated in correctly. We should fix the qps issue , otherwise it is not usable. :) I can put it as my next sprint story if nobody picks up ( a good excise for new hires though).

Aug 27 '15 23:08 zhenyun

@zhenyun I am still not clear what's incorrect here. if aggr_metrics=none, then qps cannot be calculated (since we are not aggregating anything)

Aug 28 '15 16:08 RiteshMaheshwari

In my case, I did not specify anything. The metric is custom metric (not sar output). QPS (the csv file) is wrong.

On Fri, Aug 28, 2015 at 9:31 AM, Ritesh Maheshwari <[email protected]

wrote:

@zhenyun https://github.com/zhenyun I am still not clear what's incorrect here. if aggr_metrics=none, then qps cannot be calculated (since we are not aggregating anything)

— Reply to this email directly or view it on GitHub https://github.com/linkedin/naarad/issues/337#issuecomment-135825976.

Zen (Zhenyun Zhuang, Performance Team)

Aug 28 '15 16:08 zhenyun

Okay, feel free to pick up the investigation/story since you have the full context.

On Fri, Aug 28, 2015 at 9:40 AM, zhenyun [email protected] wrote:

In my case, I did not specify anything. The metric is custom metric (not sar output). QPS (the csv file) is wrong.

On Fri, Aug 28, 2015 at 9:31 AM, Ritesh Maheshwari < [email protected]

wrote:

@zhenyun https://github.com/zhenyun I am still not clear what's incorrect here. if aggr_metrics=none, then qps cannot be calculated (since we are not aggregating anything)

— Reply to this email directly or view it on GitHub https://github.com/linkedin/naarad/issues/337#issuecomment-135825976.

Zen (Zhenyun Zhuang, Performance Team)

— Reply to this email directly or view it on GitHub https://github.com/linkedin/naarad/issues/337#issuecomment-135827922.

Aug 28 '15 16:08 RiteshMaheshwari

Sample data: cat q.csv 2015-08-25 00:00:00.0035,1 2015-08-25 00:00:01.0178,1 2015-08-25 00:00:01.0285,1 2015-08-25 00:00:01.0370,1 2015-08-25 00:00:01.0478,1 2015-08-25 00:00:01.0585,1 2015-08-25 00:00:01.0670,1 2015-08-25 00:00:01.0678,1 2015-08-25 00:00:02.0494,1 2015-08-25 00:00:02.0503,1 Config: naarad.q.conf [Query-QPS] infile=q.csv columns=query sep=,

[GRAPH] graphing_library=matplotlib Generated results: out/resources/Query-QPS.qps.csv 1440460800003,1.0 1440460801017,1.0 1440460801028,1.0 1440460801037,1.0 1440460801047,1.0 1440460801058,1.0 1440460801067,2.0

Aug 28 '15 17:08 zhenyun

I could take a look at this bug next week if no one pick it up.

Aug 28 '15 17:08 feng-tao

@zhenyun Can you take a look at this again, please install from the master branch as I used your CSV file and configuration and have this as my Query-QPS.qps.csv file:

1440460800000,1.0
1440460801000,7.0
1440460802000,2.0

Aug 28 '15 17:08 richardhsu

I'm honestly confused how I was able to get the above output as any subsequent tries gave me an error:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/rhsu/Downloads/Naarad/naarad/env/lib/python2.7/site-packages/naarad-1.0.15-py2.7.egg/naarad/utils.py", line 809, in parse_and_plot_single_metri
cs
    if metric.parse():
  File "/Users/rhsu/Downloads/Naarad/naarad/env/lib/python2.7/site-packages/naarad-1.0.15-py2.7.egg/naarad/metrics/metric.py", line 309, in parse
    aggregate_timestamp, averaging_factor = self.get_aggregation_timestamp(ts, self.aggregation_granularity)
  File "/Users/rhsu/Downloads/Naarad/naarad/env/lib/python2.7/site-packages/naarad-1.0.15-py2.7.egg/naarad/metrics/metric.py", line 193, in get_aggregation_ti
mestamp
    if granularity.lower() == 'none':
AttributeError: 'NoneType' object has no attribute 'lower'

I then went to add quickly to metric.py:

if granularity is None or granularity.lower() == 'none':
  return int(timestamp), 1
...

And then got the results that @zhenyun has. Thus we want if aggr_metrics=none is set then it just uses the timestamps to calculate QPS so if same timestamps then it aggregates. If we put aggr_metrics=second then it calculates correctly.

Questions:

Does this mean if aggr_metrics is NOT defined then it should be based on the timestamps given OR based on seconds granularity?
As @feng-tao has mentioned, if aggr_metrics=none then QPS will be incorrect. Do we want to assure that QPS is always granularity calculated by seconds? Or if aggr_metrics is set then it calculates an average of the QPS's based on aggregation? Or does it calculate as queries per minute if aggr_metrics=minute etc.

Aug 28 '15 18:08 richardhsu

Feel the default should be 'second', after all, qps stands for query per second. That is, if no "aggr_metrics" line, use second;

Aug 28 '15 19:08 zhenyun

@feng-tao We've found out that actually aggr_metrics is being used incorrectly. At some point we pushed a change in which aggregation_granularity = aggr_metrics and this is incorrect since granularity specifies second, minute, etc whereas aggr_metrics defines a set of metrics that should be aggregated.

If you could look at this sometime, that'd be great, it is based on this merge commit from your pull request: https://github.com/linkedin/naarad/commit/239c936b0140af9072abdad3439b06cc48168966.

The aggr_metrics should not be affecting the aggregation_granularity and it should be that you set aggregation_granularity=none if you don't want to aggregate (it'll aggregate by the timestamps given in the file) otherwise defaults to second for the standard aggregation across seconds.

Aug 28 '15 19:08 richardhsu

@richardhsu Thanks. I will submit a pull request to fix this. In the mean time, I think by setting aggr_metrics=second is temp work around.

Aug 28 '15 20:08 feng-tao

naarad naarad copied to clipboard

Aggregation Incorrect for QPS

naarad
naarad copied to clipboard