ha-average
median
I would suggest implementing a "median" over a defined number of readings per time period as an option. At the moment this can be approximated by averaging a large set of values, but that is not suitable for some applications.
An example of a case where the average is not enough: a battery-powered water level sensor that sends a sequence of 10 measurements every 2 hours. It takes 10 measurements because the readings vary with water level movements. The sensor is ultrasonic, so some readings are completely out of range because of bounces etc.; e.g. my sewage sensor reports a depth of 9 m from time to time, so the average varies a lot. The median of 10 measurements every 2 hours gives an almost perfect result.
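To illustrate the point above, here is a minimal sketch comparing the mean and the median of one burst of readings. The numbers are made up for illustration (they are not real data from the sensor); two of the ten readings are echo outliers at 9 m.

```python
from statistics import mean, median

# Hypothetical burst of 10 ultrasonic readings in metres.
# Two of them (9.0) stand in for the bounced-echo outliers described above.
readings = [1.21, 1.19, 1.22, 9.0, 1.20, 1.18, 1.23, 1.21, 9.0, 1.20]

avg = round(mean(readings), 2)    # pulled far upwards by the two outliers
med = round(median(readings), 2)  # stays at the typical water level

print(f"mean={avg} m, median={med} m")
```

With these sample values the mean comes out well above every plausible reading, while the median stays at the typical level.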
#9:
Calculating the median requires keeping all the values in memory, and this can have a very negative effect on HA's performance.
I've had a look because I'm also interested in using the median. In my case, though, it's to combine several sensors rather than to aggregate over time. If I understand the code correctly, it seems only line 491 would need to change.
I'll try to test this on my installation this weekend, and if it works as expected I'll come back to you.
@jere19, alas, calculating the median is more difficult than simply changing the logic of line 491. Look at the description of the median calculation logic on Wikipedia: https://en.wikipedia.org/wiki/Median#Finite_data_set_of_numbers
Considering that this component does not work with datasets but, in effect, with areas of geometric shapes, it would be more correct to calculate medians using algorithms meant for such cases. That would mean calculating factorials, which would dramatically increase the computational cost... https://en.wikipedia.org/wiki/Median#Probability_distributions
I don't see a solution with reasonable computational costs yet.
Maybe I'm missing something about how the component works, but for my use case, replacing the line:
self._state = round(sum(values) / len(values), self._precision)
with
self._state = round(median(values), self._precision)
correctly returns the median of my 6 sensors, using the import from the standard library (from statistics import median).
I've also added temporal averaging to my setup to check that it doesn't break things. The computations still look fine.
I've included a screenshot of the resulting sensor. The temperature is a coherent value and the curve is smoothed.
And the median function from the statistics module takes roughly 3x the time of the classical average, so it's still reasonable. But of course, that's assuming I haven't missed a bigger problem.
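For anyone who wants to check the timing on their own machine, here is a rough sketch using timeit. The six values and the precision are placeholders, not readings from my sensors, and the ratio will depend on the Python version and the number of values, so treat the result as indicative only.

```python
import timeit
from statistics import median

# Placeholder values standing in for six temperature sensors (assumption).
values = [20.5, 21.0, 19.8, 22.1, 20.9, 21.4]
precision = 2  # stand-in for self._precision

def average_state():
    # Same expression as the original line in the component.
    return round(sum(values) / len(values), precision)

def median_state():
    # Same expression as the proposed replacement line.
    return round(median(values), precision)

runs = 10_000
mean_time = timeit.timeit(average_state, number=runs)
median_time = timeit.timeit(median_state, number=runs)

print(f"average: {mean_time:.4f}s, median: {median_time:.4f}s, "
      f"ratio: {median_time / mean_time:.1f}x")
```

Note that this only measures the two expressions on plain numbers; it says nothing about whether the contents of values in the component are the right input for a median.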
Here's my config:
- platform: average
  name: "TemperatureInterieure"
  duration:  # added this only to check for troubles
    minutes: 10
  entities:
    - sensor.ble_temperature_bureau
    - sensor.ble_temperature_salon
    - sensor.ble_temperature_chambreparents
    - sensor.ble_temperature_chambrearthur
    - sensor.ble_temperature_chambreelliott
    - sensor.ble_temperature_salledebain
You end up with the wrong median. The values array stores not the original sensor values but the normalized ones. However, thanks, you gave me an idea of how to implement the median calculation.
Beta version of median calculation: https://github.com/Limych/ha-average/tree/feature/median
Have a nice day! :)
Great! I'm happy to have been of some help. I'll try this version!
I'm wondering whether it is worth making the median value smoother. Currently it is calculated by choosing the closest value from the data set, which is not entirely correct. It would be more correct to calculate the median according to the principles of geometry, i.e. by dividing the graph, treated as a polygon, into two equal parts. This means that the median would change smoothly over time.
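One simpler reading of the "divide the graph into two equal parts" idea is a time-weighted median: each value is weighted by how long the sensor held it, and the median is the value at which half of the total time is reached. The sketch below assumes a step-shaped graph (a value holds until the next sample) and still returns a value from the data set, so it is not the smooth polygon-based version described above and not what the feature/median branch does. The function name and sample data are hypothetical.

```python
from statistics import median

def time_weighted_median(samples, end):
    """Median of a step-shaped series, weighting each value by its duration.

    samples: list of (timestamp, value) pairs sorted by timestamp.
    end: timestamp at which the observation period ends.
    """
    # Pair every sample with the timestamp of the next one (or the period end)
    # to get how long each value was held.
    next_times = [t for t, _ in samples[1:]] + [end]
    held = sorted((value, t_next - t) for (t, value), t_next in zip(samples, next_times))

    half = sum(duration for _, duration in held) / 2
    elapsed = 0
    for value, duration in held:
        elapsed += duration
        if elapsed >= half:
            return value

# Hypothetical series: 20.0 for 10 s, 25.0 for 40 s, 21.0 for 10 s.
samples = [(0, 20.0), (10, 25.0), (50, 21.0)]

weighted = time_weighted_median(samples, end=60)
plain = median(value for _, value in samples)
print(weighted, plain)
```

In this example the plain median of the three values is 21.0, while the time-weighted version returns 25.0 because that value was held for most of the period.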
I'm not sure it would be worth the effort. Here's a screenshot on my setup with the new branch (same configuration as earlier):