feinstaub-map

Use median instead of mean

Open joshmh opened this issue 6 years ago • 3 comments

Oftentimes, some cells on the map turn orange or red just because of one faulty device. This gives an unrealistic view of air quality in the region. The median is a more robust statistic than the mean: it is barely affected by a few extreme values, while still representing the consensus of the sensors in the region.
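To illustrate the effect, here is a minimal sketch in Python. The readings are made-up values for a single hypothetical map cell and are not taken from the map's data or code. One faulty sensor drags the mean far above what the other sensors report, while the median stays close to them.

```python
from statistics import mean, median

# Hypothetical PM readings (µg/m³) from six sensors in one map cell.
# The last value stands in for a single faulty device.
readings = [12.0, 14.0, 11.0, 13.0, 15.0, 999.0]

cell_mean = mean(readings)      # pulled far upward by the one faulty reading
cell_median = median(readings)  # stays near the values the other sensors agree on

print(f"mean:   {cell_mean:.1f}")
print(f"median: {cell_median:.1f}")
```

With these numbers the median lands between the two middle readings, so the cell would be colored according to what most sensors report.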

joshmh, May 21 '18 10:05

Here is a before and after comparison, top is with mean, bottom is with median.

[Screenshots: map rendered with mean (top) and with median (bottom)]

joshmh, May 21 '18 10:05

And here is an example of the offending sensor (Dresden area):

[Screenshot: the offending sensor, Dresden area]

joshmh, May 21 '18 10:05

I disagree with "Use median instead of mean". For air quality, extreme values are the most important ones (if they aren't outliers). 1) There should be options to display the mean, the max, and percentiles (25-50-75-90-95), which would include the median. 2) If a max value seems unlikely (sensor problem), then the sensor should be verified and the data deleted. If the sensor is correct, it is extremely important to keep the value (and still calculate the mean with it).
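As a rough sketch of option 1, the statistics listed above could be computed per cell as follows. The function name `summarize` and the sample readings are hypothetical, and the percentile interpolation method (`"inclusive"`) is just one reasonable choice, not something the map is known to use.

```python
from statistics import mean, quantiles

def summarize(readings):
    """Return mean, max, and the 25/50/75/90/95 percentiles of a cell's readings.

    Hypothetical helper; needs at least two readings.
    """
    # quantiles(..., n=100) returns the 99 cut points P1..P99,
    # so percentile p sits at index p - 1.
    cuts = quantiles(readings, n=100, method="inclusive")
    stats = {"mean": mean(readings), "max": max(readings)}
    for p in (25, 50, 75, 90, 95):
        stats[f"p{p}"] = cuts[p - 1]
    return stats

# Made-up readings (µg/m³) for one cell.
summary = summarize([10, 20, 30, 40, 50])
for name, value in summary.items():
    print(f"{name}: {value}")
```

Here `p50` is the median, so a display option could switch between any of these values without discarding the extremes.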

BrunoKestemont, Jan 28 '19 16:01