ha-average

median

Open nocturneop15 opened this issue 4 years ago • 10 comments

I would suggest implementing "median" over a defined number of readings per time period as an option. At the moment, something similar can be achieved by averaging a large set of values, but this is not suitable for some applications.

An example where the average is not enough: a battery-powered water level sensor that sends a sequence of 10 measurements every 2 hours. It takes 10 measurements because the readings vary with water level movement. The sensor is ultrasonic, so some readings are completely out of range because of bounces etc.; e.g. my sewage sensor reports a depth of 9 m from time to time, so the average varies a lot. The median of the 10 measurements every 2 hours gives an almost perfect result.
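To illustrate the point above, here is a minimal sketch with made-up numbers (the readings are hypothetical, not taken from the actual sensor). Two spurious 9 m echoes in a burst of ten pull the mean far away from the real level, while the median is unaffected:

```python
from statistics import mean, median

# Hypothetical burst of 10 ultrasonic readings in metres.
# Two of them (9.00) stand in for spurious echoes.
readings = [1.21, 1.19, 1.22, 9.00, 1.20, 1.18, 1.23, 9.00, 1.21, 1.20]

burst_mean = mean(readings)      # dragged upwards by the two outliers
burst_median = median(readings)  # middle of the sorted readings

print(f"mean:   {burst_mean:.2f} m")
print(f"median: {burst_median:.2f} m")
```

With these numbers the mean lands near 2.76 m and the median at 1.21 m.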

nocturneop15 avatar Nov 24 '20 08:11 nocturneop15

#9:

Calculating the median requires keeping every value in memory, and this can very negatively affect the performance of HA.

Limych avatar Nov 24 '20 15:11 Limych

I've had a look because I'm also interested in using the median. But in my case it's to average several sensors, not over time. If I understand the code correctly, it seems only line 491 would need to change.

I'll try to test this on my installation this weekend, and if it works as expected I'll come back to you.

jere19 avatar Jan 08 '21 09:01 jere19

@jere19, alas, calculating the median is more difficult than simply changing the logic of line 491. Look at the description of the median calculation logic in Wikipedia: https://en.wikipedia.org/wiki/Median#Finite_data_set_of_numbers
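For reference, the finite-data-set definition from that Wikipedia section can be sketched as follows. The function name median_of is hypothetical and this is not code from the component; it only shows that the whole data set has to be available and sorted:

```python
def median_of(values):
    """Median of a finite data set: sort, then take the middle element,
    or the mean of the two middle elements when the count is even."""
    ordered = sorted(values)  # needs every value at once
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([3, 1, 2]))
print(median_of([4, 1, 3, 2]))
```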

Limych avatar Jan 10 '21 12:01 Limych

Considering that this component does not work with datasets, but in fact with areas of geometric shapes, it would be more correct to calculate medians using algorithms for such cases. And this means the need to calculate factorials, which will dramatically increase the computational costs... https://en.wikipedia.org/wiki/Median#Probability_distributions

I don't see a solution with reasonable computational costs yet.

Limych avatar Jan 10 '21 12:01 Limych

Maybe I'm missing some of the working of the component. But for my use case, switching the line:

self._state = round(sum(values) / len(values), self._precision)

for

self._state = round(median(values), self._precision)

does correctly return the median of my 6 sensors, with the import from the standard library (from statistics import median). I've also added temporal averaging to my setup to check that it doesn't break things. Computations still look fine. I've included a screenshot of the resulting sensor. The temperature is a coherent value and the curve is smoothed.

(Screenshot from 2021-01-11 21-30-40)

And the median function from the statistics module takes roughly 3x the time of the classical average, so it's still reasonable. But of course, that's assuming I haven't missed a bigger problem.
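A rough way to check that kind of ratio is a timeit comparison like the sketch below. The six temperatures are hypothetical and the result depends on the machine and Python version, so no particular ratio is claimed here:

```python
import timeit
from statistics import median

# Six hypothetical temperature readings, one per sensor.
values = [20.5, 21.0, 19.8, 22.1, 20.9, 21.4]


def by_mean():
    # The "classical average" from the original line of code.
    return sum(values) / len(values)


def by_median():
    # The standard-library median used as a replacement.
    return median(values)


runs = 20_000
t_mean = timeit.timeit(by_mean, number=runs)
t_median = timeit.timeit(by_median, number=runs)

print(f"mean:   {t_mean:.4f} s for {runs} runs")
print(f"median: {t_median:.4f} s for {runs} runs")
print(f"ratio:  {t_median / t_mean:.1f}x")
```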

Here's my config:

    - platform: average
      name: "TemperatureInterieure"
      duration:  # added this only to check for troubles
        minutes: 10
      entities:
        - sensor.ble_temperature_bureau
        - sensor.ble_temperature_salon
        - sensor.ble_temperature_chambreparents
        - sensor.ble_temperature_chambrearthur
        - sensor.ble_temperature_chambreelliott
        - sensor.ble_temperature_salledebain

jere19 avatar Jan 11 '21 20:01 jere19

You end up with the wrong median. The values array stores not the original, but the normalized sensor values. However, thanks, you gave me an idea of how to implement the calculation of the median.

Limych avatar Jan 11 '21 21:01 Limych

Beta version of median calculation: https://github.com/Limych/ha-average/tree/feature/median

Have a nice day! :)

Limych avatar Jan 12 '21 18:01 Limych

Great! I'm happy to have been of some help. I'll try this version!

jere19 avatar Jan 13 '21 08:01 jere19

(Screenshot: 2021-01-13_14-59-11)

I’m thinking, is it worth making the median value smoother? Now it is calculated by choosing the closest value from the data set, which is not entirely correct. It would be more correct to calculate the median according to the principles of geometry: i.e. dividing the graph, as a polygon, into two equal parts. This means that the median should change smoothly over time.
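As a point of comparison only, here is a sketch of a simpler, related technique: a time-weighted median, where each value counts in proportion to how long the sensor held it. This is an assumption about what "taking time into account" could look like, not the geometric polygon-splitting approach described above and not the code in the feature/median branch. It still returns one of the observed values. The name time_weighted_median and the sample data are hypothetical:

```python
def time_weighted_median(samples):
    """samples: iterable of (value, duration_seconds) pairs, where the
    sensor is assumed to hold `value` for `duration_seconds`.
    Returns the smallest value at which the cumulative duration of all
    values up to it reaches half of the total time."""
    ordered = sorted(samples)  # sort by value
    total = sum(duration for _, duration in ordered)
    if total <= 0:
        raise ValueError("total duration must be positive")
    half = total / 2
    elapsed = 0.0
    for value, duration in ordered:
        elapsed += duration
        if elapsed >= half:
            return value


# Hypothetical history: 20.0 held for 60 s, 25.0 for 10 s, 21.0 for 30 s.
history = [(20.0, 60), (25.0, 10), (21.0, 30)]
print(time_weighted_median(history))
```

In this example the plain median of the three values would be 21.0, while the time-weighted version returns 20.0 because the sensor spent more than half of the period at that value.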

Limych avatar Jan 13 '21 12:01 Limych

I'm not sure it would be worth the effort. Here's a screenshot on my setup with the new branch (same configuration as earlier): image

jere19 avatar Jan 14 '21 12:01 jere19