Ability to set non-static fill value, based on existing data in series for aggregates

Open dbalagansky opened this issue 2 years ago • 2 comments

Which version of Gnocchi are you using

$ gnocchi --version
gnocchi 7.0.7
$ gnocchi server version
+---------+-------+
| Field   | Value |
+---------+-------+
| version | 4.4.2 |
+---------+-------+

on OpenStack Xena release.

How to reproduce your problem

I'm trying to get CPU and memory utilization as a percentage on a scale from 0 to 100 (for CPU, this means that the utilization percentage doesn't depend on the number of vCPUs in the VM).

For cpu I have this data in series:

$ gnocchi aggregates --resource-type instance --start 2022-07-27T12:00:00 '(/ (metric cpu rate:mean) 60000000000)' id=c1081247-90cb-471b-b1e3-3a927de2e042
+----------------------------------------------------+---------------------------+-------------+----------------------+
| name                                               | timestamp                 | granularity |                value |
+----------------------------------------------------+---------------------------+-------------+----------------------+
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:00:00+00:00 |       300.0 |               0.0696 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:05:00+00:00 |       300.0 |  0.06923333333333333 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:10:00+00:00 |       300.0 |  0.06856666666666666 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:15:00+00:00 |       300.0 |  0.06993333333333333 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:20:00+00:00 |       300.0 |               0.0723 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:25:00+00:00 |       300.0 |               0.0712 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:30:00+00:00 |       300.0 |  0.06833333333333333 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:35:00+00:00 |       300.0 |  0.06629166666666667 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:40:00+00:00 |       300.0 |  0.06883333333333333 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:45:00+00:00 |       300.0 |  0.07066666666666667 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:50:00+00:00 |       300.0 |  0.07073333333333333 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T12:55:00+00:00 |       300.0 |  0.07306666666666667 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T13:00:00+00:00 |       300.0 |               0.0712 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T13:05:00+00:00 |       300.0 |  0.07343333333333334 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T13:10:00+00:00 |       300.0 |               0.0745 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T13:15:00+00:00 |       300.0 |  0.07703333333333333 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T13:20:00+00:00 |       300.0 |  0.07103333333333334 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T13:25:00+00:00 |       300.0 |               0.0712 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T13:30:00+00:00 |       300.0 |  0.07246666666666667 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T13:35:00+00:00 |       300.0 |               0.0376 |
| c1081247-90cb-471b-b1e3-3a927de2e042/cpu/rate:mean | 2022-07-27T13:40:00+00:00 |       300.0 | 0.037333333333333336 |
+----------------------------------------------------+---------------------------+-------------+----------------------+

For vcpus I have this:

$ gnocchi aggregates --resource-type instance --start 2022-07-27T12:00:00 '(metric vcpus mean)' id=c1081247-90cb-471b-b1e3-3a927de2e042
+-------------------------------------------------+---------------------------+-------------+-------+
| name                                            | timestamp                 | granularity | value |
+-------------------------------------------------+---------------------------+-------------+-------+
| c1081247-90cb-471b-b1e3-3a927de2e042/vcpus/mean | 2022-07-27T12:00:00+00:00 |       300.0 |   4.0 |
| c1081247-90cb-471b-b1e3-3a927de2e042/vcpus/mean | 2022-07-27T13:00:00+00:00 |       300.0 |   4.0 |
+-------------------------------------------------+---------------------------+-------------+-------+

What is the result that you get

Now, when I try to combine those two series to get CPU utilization on a scale from 0 to 100, the result only contains data points at timestamps that are present in both series:

$ gnocchi aggregates --resource-type instance --start 2022-07-27T12:00:00 '(/ (aggregate mean (/ (metric cpu rate:mean) (metric vcpus mean))) 60000000000))' id=c1081247-90cb-471b-b1e3-3a927de2e042
+------------+---------------------------+-------------+--------+
| name       | timestamp                 | granularity |  value |
+------------+---------------------------+-------------+--------+
| aggregated | 2022-07-27T12:00:00+00:00 |       300.0 | 0.0174 |
| aggregated | 2022-07-27T13:00:00+00:00 |       300.0 | 0.0178 |
+------------+---------------------------+-------------+--------+
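The two-row result can be reproduced by hand. The following is a minimal sketch of the behaviour described here, not Gnocchi's implementation: it divides two series only at timestamps they share. The function name `divide_on_shared_timestamps` is hypothetical, and the input is a subset of the cpu values shown above (already divided by 60000000000).

```python
# Subset of the cpu series above (values already divided by 60000000000).
cpu = {
    "2022-07-27T12:00:00+00:00": 0.0696,
    "2022-07-27T12:05:00+00:00": 0.06923333333333333,
    "2022-07-27T12:55:00+00:00": 0.07306666666666667,
    "2022-07-27T13:00:00+00:00": 0.0712,
    "2022-07-27T13:05:00+00:00": 0.07343333333333334,
}

# The vcpus series only has two points in the same window.
vcpus = {
    "2022-07-27T12:00:00+00:00": 4.0,
    "2022-07-27T13:00:00+00:00": 4.0,
}


def divide_on_shared_timestamps(numerator, denominator):
    """Divide two series point by point, keeping only shared timestamps.

    Hypothetical helper for illustration; it is not part of Gnocchi.
    """
    shared = sorted(numerator.keys() & denominator.keys())
    return {ts: numerator[ts] / denominator[ts] for ts in shared}


result = divide_on_shared_timestamps(cpu, vcpus)
for ts, value in result.items():
    print(ts, round(value, 4))
```

Only the 12:00 and 13:00 points survive, and their values (0.0696 / 4 and 0.0712 / 4) match the two rows in the table above. Every cpu point in between is dropped because vcpus has no point at that timestamp.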

This works as expected, and the fill parameter is meant to address it. However, I can only set a static value for fill, whereas for this kind of aggregate to work properly I need missing values in the vcpus series to be filled with the previous non-null (non-None, non-NaN) value.
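To make the requested behaviour concrete, here is a minimal client-side sketch of a "previous value" fill. It is an illustration under stated assumptions, not an existing Gnocchi feature: the helper names `forward_fill` and `cpu_utilization` are hypothetical, and the input is a subset of the cpu and vcpus values shown above.

```python
from datetime import datetime, timedelta


def forward_fill(sparse, timestamps):
    """Give every timestamp the most recent known value from `sparse`.

    Timestamps that precede the first known point get None.
    Hypothetical helper for illustration; it is not part of Gnocchi.
    """
    known = sorted(sparse.items())
    filled = {}
    last = None
    i = 0
    for ts in sorted(timestamps):
        # Advance past every known point at or before this timestamp.
        while i < len(known) and known[i][0] <= ts:
            last = known[i][1]
            i += 1
        filled[ts] = last
    return filled


def cpu_utilization(cpu, vcpus):
    """Divide each cpu point by the forward-filled vcpus value.

    Points with no usable vcpus value (None or 0) are skipped.
    """
    filled = forward_fill(vcpus, cpu.keys())
    return {ts: cpu[ts] / filled[ts] for ts in sorted(cpu) if filled[ts]}


start = datetime(2022, 7, 27, 12, 0)
step = timedelta(minutes=5)

# Subset of the cpu series above (values already divided by 60000000000).
cpu = {
    start: 0.0696,
    start + step: 0.06923333333333333,
    start + 2 * step: 0.06856666666666666,
    start + 12 * step: 0.0712,
    start + 13 * step: 0.07343333333333334,
}

# vcpus only has points at 12:00 and 13:00.
vcpus = {start: 4.0, start + 12 * step: 4.0}

util = cpu_utilization(cpu, vcpus)
for ts, value in util.items():
    print(ts.isoformat(), value)
```

With this fill, every cpu point produces a result, because the 12:05 and 12:10 points reuse the vcpus value from 12:00 and the 13:05 point reuses the one from 13:00.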

Previously, this could be achieved by creating the same percentage metrics with a transform in the sink, which was deprecated and subsequently removed.

What is the result that you expected

I expect a way to get CPU and memory utilization as a percentage, on a scale from 0 to 100, for every available data point in the cpu and memory.usage series, by using a fill value calculated from existing data in the series.

Some time ago there were attempts to fix this missing bit within ceilometer:

  • https://review.opendev.org/c/openstack/ceilometer/+/799963 for cpu_util
  • https://review.opendev.org/c/openstack/ceilometer/+/597054 for memory_util

The first of these raised a valid question about why a percentage would be stored as a cumulative value, so Gnocchi looks like the proper place to add this functionality.

dbalagansky, Jul 27 '22 14:07

I assume fill=dropna could be used, but are you explicitly interested in filling with the previous value as well?

tobias-urdin, Aug 10 '22 08:08

I assume fill=dropna could be used, but are you explicitly interested in filling with the previous value as well?

Yes. :)

As far as I understand, using fill=dropna would result in the same behaviour as above: I would only get data points in the result where both series have a value at roughly the same timestamp, as in the last command output I showed (the default fill behaviour), which effectively loses some of the data.

dbalagansky, Aug 12 '22 10:08