pvanalytics
Automatically detect "filler" values
Many data sets contain filler values that mark times when the sensor or data logger was not working (e.g. 999, -999). When we know what the filler values are, it is fairly straightforward to detect and remove them; however, we may not always know which values are used as "filler." We should add some methods for automatically identifying filler values in quality.gaps.
One method for automatically identifying filler-values could be to look for identical outliers that occur repeatedly throughout the data. There are almost certainly other ways to do this. This issue can serve as a solicitation for these methods if anyone has any suggestions.
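To make the "identical outliers that occur repeatedly" idea concrete, here is a rough sketch. It assumes a numeric pandas Series and uses the median absolute deviation (MAD) to decide what counts as an outlier; the function name `repeated_outliers`, the parameters `min_count` and `mad_factor`, and the thresholds are all hypothetical and are not part of pvanalytics.

```python
import pandas as pd


def repeated_outliers(series, min_count=3, mad_factor=5.0):
    """Return outlying values that occur at least `min_count` times.

    Hypothetical helper, not an existing pvanalytics function.
    A value is treated as an outlier when it lies more than
    `mad_factor` median absolute deviations from the median.
    """
    values = series.dropna()
    median = values.median()
    deviation = (values - median).abs()
    mad = deviation.median()
    # If mad is 0 (more than half the data is one value), every value
    # that differs from the median is treated as an outlier.
    outliers = values[deviation > mad_factor * mad]
    counts = outliers.value_counts()
    return sorted(counts[counts >= min_count].index.tolist())


# Example: 20 plausible readings, three -999 fillers, one isolated spike.
data = pd.Series(
    [float(x) for x in range(10, 30)] + [-999.0, -999.0, -999.0, 500.0]
)
candidates = repeated_outliers(data)
```

In this example the isolated spike at 500 is an outlier but occurs only once, so only the repeated -999 is reported as a candidate filler value. The MAD is used here instead of the standard deviation because the filler values themselves would inflate a standard deviation.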
A possible method could first look for the common filler values (-999, 999, "null", etc.) and return an error if none, or more than one, is found, before going on to your suggestion of finding the most common value.
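A minimal sketch of that simple check, assuming a pandas Series as input. The candidate list, the function name `find_known_filler`, and the choice of `ValueError` are assumptions for illustration only, not an existing pvanalytics API.

```python
import pandas as pd

# Assumed list of common fillers, taken from the examples in this thread.
COMMON_FILLERS = (-999, 999, "null")


def find_known_filler(series, candidates=COMMON_FILLERS):
    """Return the single candidate filler value present in `series`.

    Hypothetical helper. Raises ValueError if none of the candidates,
    or more than one of them, appears in the data.
    """
    found = [c for c in candidates if series.isin([c]).any()]
    if len(found) != 1:
        raise ValueError(
            f"expected exactly one known filler value, found {found}"
        )
    return found[0]


readings = pd.Series([12.1, 11.8, -999.0, 12.4, -999.0])
filler = find_known_filler(readings)
```

Raising on ambiguity leaves the decision to the caller, who could then fall back to a more automatic method such as looking for repeated outliers.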
@rjstephens Definitely a good idea to implement the simple filter first. I haven't gotten around to it yet. I like your suggestion to look for multiple occurrences of each value.