Mean of circular quantities

Open wfjm opened this issue 4 years ago • 4 comments

Proposal: Add a function which calculates the proper mean of circular quantities.

Current behavior: InfluxDB provides a mean() function, which is good for linear quantities but is not the proper approach for circular quantities such as angles. The "mean" of 350 degrees and 20 degrees should be 5 degrees, and obviously not 185 degrees, as mean() would return.

Desired behavior: The proper way of calculating the mean of circular quantities is well known, see https://en.wikipedia.org/wiki/Mean_of_circular_quantities, and is, in a nutshell:

  atan2( mean(sin(x)), mean(cos(x)) )

A circular_mean() function would most likely need two more arguments:

  • to control the scale (e.g. radians, degrees)
  • to control whether the result is in [0,scale[ or [-0.5*scale,+0.5*scale[
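
The formula and the two proposed arguments can be sketched in a few lines of Python, used here only as a neutral reference for checking results. This is not an InfluxDB or Flux API: the function name `circular_mean` and the parameters `scale` and `signed` are hypothetical, chosen to mirror the two bullets above.

```python
import math


def circular_mean(values, scale=360.0, signed=False):
    """Circular mean of angles, where one full turn equals `scale`.

    scale  -- size of a full turn (360.0 for degrees, 2*pi for radians)
    signed -- False: result in [0, scale[; True: result in [-scale/2, +scale/2[
    """
    if not values:
        raise ValueError("circular mean of an empty sequence is undefined")

    to_rad = 2.0 * math.pi / scale
    mean_sin = sum(math.sin(v * to_rad) for v in values) / len(values)
    mean_cos = sum(math.cos(v * to_rad) for v in values) / len(values)

    # atan2 takes (y, x), so the sine mean comes first.
    angle = math.atan2(mean_sin, mean_cos) / to_rad

    if signed:
        # Wrap so that +scale/2 maps to -scale/2 (half-open interval).
        return (angle + scale / 2.0) % scale - scale / 2.0
    # Wrap into [0, scale[, up to floating-point rounding.
    return angle % scale


print(circular_mean([350.0, 20.0]))               # approximately 5.0
print(circular_mean([350.0, 0.0], signed=True))   # approximately -5.0
```

With this sketch, 350 and 20 degrees average to roughly 5 degrees rather than 185. Note that the result is ill-defined when the vectors cancel out (for example 0 and 180 degrees), because both means are then close to zero.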

Use case: Currently it's cumbersome to treat angles in InfluxDB. Angles are quite common in technical measurements (e.g. orientation of a device). Even telegraf generates them out-of-the-box (the wind direction returned by the openweathermap input plugin).

wfjm avatar Jun 06 '20 17:06 wfjm

@wfjm thanks for opening this. sounds like this would make a great Flux library.

russorat avatar Jun 26 '20 23:06 russorat

I work in wind energy and we're using influxdb. This is a key issue!

aclerc avatar Jun 15 '22 16:06 aclerc

So I think I figured out how to implement this as a custom aggregation function:

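import "math"

// Circular mean of the angles (in radians) found in the _value column.
// `column` is only used to name the output column.
circularMean = (tables=<-, column) =>
    tables
        |> reduce(
            identity: {count: 0.0, sumX: 0.0, sumY: 0.0, avg: 0.0},
            fn: (r, accumulator) => {
                // Project the angle onto the unit circle.
                x = math.cos(x: r._value)
                y = math.sin(x: r._value)

                return {
                    count: accumulator.count + 1.0,
                    sumX: accumulator.sumX + x,
                    sumY: accumulator.sumY + y,
                    // Angle of the mean vector so far.
                    avg: math.atan2(
                        x: (accumulator.sumX + x) / (accumulator.count + 1.0),
                        y: (accumulator.sumY + y) / (accumulator.count + 1.0),
                    ),
                }
            },
        )
        // Keep only the final mean, under the requested column name.
        |> drop(columns: ["sumX", "sumY", "count"])
        |> rename(columns: {avg: column})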

which can then be used as:

mydata
  |> aggregateWindow(every: v.windowPeriod, fn: circularMean, createEmpty: false)

I'm not 100% sure about this implementation:

  • I couldn't figure out how to use the column parameter for the input instead of hardcoding _value (which should work in most cases?)
  • I think I mixed up cos and sin, but it yields correct results in my case (maybe because my input is already messed up 🤷‍♂️)
  • It expects a _value column in radians in the input table, and will output the circular mean in the [-0.5*scale,+0.5*scale[ range (here -π to +π), which you might want to adjust via map afterwards
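
As a cross-check of the points above, here is a rough Python mirror of the reduce step (not Flux, and not part of the original function). It keeps cosine as the x component and sine as the y component, as the Flux code does when it passes them to math.atan2 by name, and then converts the result from radians to degrees in [0, 360[. The helper names `step`, `circular_mean_radians` and `to_unsigned_degrees` are made up for this illustration.

```python
import math
from functools import reduce

IDENTITY = {"count": 0.0, "sumX": 0.0, "sumY": 0.0, "avg": 0.0}


def step(acc, value):
    """One reduce step: update the running sums and recompute the mean angle."""
    count = acc["count"] + 1.0
    sum_x = acc["sumX"] + math.cos(value)  # x component is the cosine
    sum_y = acc["sumY"] + math.sin(value)  # y component is the sine
    return {
        "count": count,
        "sumX": sum_x,
        "sumY": sum_y,
        # Python's atan2 is positional: (y, x).
        "avg": math.atan2(sum_y / count, sum_x / count),
    }


def circular_mean_radians(values):
    """Fold all values through `step`; input and output are in radians."""
    return reduce(step, values, IDENTITY)["avg"]


def to_unsigned_degrees(angle_rad):
    """Convert a radian angle in [-pi, pi] to degrees in [0, 360[."""
    return math.degrees(angle_rad) % 360.0


samples_deg = [350.0, 20.0]
avg = circular_mean_radians([math.radians(d) for d in samples_deg])
print(to_unsigned_degrees(avg))  # approximately 5.0
```

If cos and sin really were swapped, the same input would come out near 85 degrees instead of 5, so a known pair like this one is a quick way to test the pipeline end to end.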

In any case I hope it can be of use to others, at least until a better implementation is added to the built-ins.

godric avatar Jun 22 '22 21:06 godric

@godric That implementation looks good to me.

I couldn't figure how to use the column parameter for the input instead of hardcoding _value (which should work in most cases?)

Also, this is currently not possible in Flux, so it makes sense that you couldn't figure it out. However, we are working on an update to Flux's type system and syntax that would make this possible in the future.

nathanielc avatar Jun 24 '22 22:06 nathanielc