feinstaub-map
Use median instead of mean
Often, a cell on the map turns orange or red because of a single faulty device, which gives an unrealistic view of air quality in the region. The median is a more robust statistic than the mean: it is barely affected by a few extreme values, while still representing the consensus of the sensors in the region.
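To illustrate the effect, here is a minimal Python sketch. The readings and the names `cell_readings`, `cell_mean`, and `cell_median` are made up for this example and are not taken from the map's code; it only shows how one extreme reading moves the mean but not the median.

```python
from statistics import mean, median

# Hypothetical PM10 readings (µg/m³) from the sensors in one map cell.
# The last value simulates a single faulty device.
cell_readings = [12.0, 14.5, 13.2, 15.1, 999.9]

# The mean is pulled far above what the four plausible sensors report.
cell_mean = mean(cell_readings)

# The median is the middle value of the sorted readings, so the single
# extreme reading does not change it.
cell_median = median(cell_readings)

print(f"mean:   {cell_mean:.2f}")
print(f"median: {cell_median:.2f}")
```

With these invented numbers the mean lands above 200 while the median stays at 14.5, which is the difference between a red cell and a green one.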
Here is a before-and-after comparison: the top image uses the mean, the bottom one uses the median.
![screen shot 2018-05-21 at 1 25 58 pm](https://user-images.githubusercontent.com/4651/40303222-994c4b60-5cfa-11e8-8aa0-6e37455fa97a.png)
![screen shot 2018-05-21 at 1 26 09 pm](https://user-images.githubusercontent.com/4651/40303229-9ca72348-5cfa-11e8-817e-f5c734564af1.png)
And here is an example of the offending sensor (Dresden area):
![screen shot 2018-05-21 at 1 28 28 pm](https://user-images.githubusercontent.com/4651/40303375-33bbce1e-5cfb-11e8-9c33-20afa4d8305c.png)
I disagree with "Use median instead of mean". For air quality, extreme values are the most important ones (if they aren't outliers). 1) There should be options to display the mean, the max, and percentiles (25-50-75-90-95), which would include the median. 2) If a max value seems unlikely (sensor problem), the sensor should be verified and its data deleted. If the sensor is correct, it is extremely important to keep the value and still include it when calculating the mean.
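As a rough sketch of what option 1) could look like, the Python snippet below computes the mean, the max, and the listed percentiles for one cell. The function name `summarize_cell` and the sample readings are hypothetical, and the "inclusive" interpolation method is just one reasonable choice, not something the project prescribes.

```python
from statistics import mean, quantiles

def summarize_cell(readings, percentiles=(25, 50, 75, 90, 95)):
    """Return mean, max, and selected percentiles for one map cell.

    Hypothetical helper for illustration only. Expects at least two
    readings, since statistics.quantiles cannot interpolate from one.
    """
    # quantiles(..., n=100) returns the 99 cut points p1..p99, so the
    # p-th percentile sits at index p - 1.
    cuts = quantiles(readings, n=100, method="inclusive")
    summary = {"mean": mean(readings), "max": max(readings)}
    for p in percentiles:
        summary[f"p{p}"] = cuts[p - 1]
    return summary

# Made-up readings (µg/m³) for a single cell.
stats = summarize_cell([10.0, 20.0, 30.0, 40.0, 50.0])
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

Which of these values colors the cell could then be a user-facing option, with `p50` giving the median view proposed above.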