Automatically detect "filler" values

Open wfvining opened this issue 4 years ago • 2 comments

Many data sets contain filler values that mark times when the sensor or data logger was not working (e.g. 999, -999). When we know what the filler values are, it is fairly straightforward to detect and remove them; however, we may not always know which values are used as "filler." We should add some methods for automatically identifying filler values in quality.gaps.

One method for automatically identifying filler-values could be to look for identical outliers that occur repeatedly throughout the data. There are almost certainly other ways to do this. This issue can serve as a solicitation for these methods if anyone has any suggestions.
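
As a rough illustration of the "identical outliers that occur repeatedly" idea, here is a minimal sketch in plain Python. It is not an existing pvanalytics function: the name `find_repeated_outliers`, the median/MAD outlier test, and the `threshold` and `min_count` defaults are all assumptions chosen for the example.

```python
from collections import Counter
from statistics import median


def find_repeated_outliers(values, threshold=5.0, min_count=3):
    """Return outlying values that repeat exactly at least `min_count` times.

    Hypothetical helper. A value counts as an outlier when its distance
    from the median exceeds `threshold` times the median absolute
    deviation (MAD); both parameters are illustrative assumptions.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # More than half the data is identical; this test cannot rank outliers.
        return []
    # Count exact repeats among the outlying values only.
    counts = Counter(v for v in values if abs(v - center) / mad > threshold)
    return sorted(v for v, n in counts.items() if n >= min_count)


readings = [20.1, 20.5, 19.8, -999, 21.0, 20.3, -999, 19.9, 20.7, -999, 35.0]
candidates = find_repeated_outliers(readings)
```

With these made-up readings, `35.0` is an outlier but occurs only once, so only `-999` is reported. The sketch assumes filler values make up less than half of the data; otherwise the median itself would land on the filler value.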

wfvining avatar Apr 23 '20 19:04 wfvining

A possible method could look for the common filler values (-999, 999, "null", etc.) and return an error if none or more than one is found, before going on to your suggestion of finding the most common value.
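
A minimal sketch of that first check, in plain Python. The function name `detect_known_filler` and the default candidate list are hypothetical, not part of pvanalytics, and raising `ValueError` is just one way to "return an error."

```python
def detect_known_filler(values, candidates=(-999, 999, "null")):
    """Return the single well-known filler value present in `values`.

    Hypothetical helper: raises ValueError when none of the candidates
    is present, or when more than one is, so the caller can fall back
    to another detection method.
    """
    present = set(values)
    found = [c for c in candidates if c in present]
    if len(found) != 1:
        raise ValueError(
            f"expected exactly one known filler value, found {found}"
        )
    return found[0]


filler = detect_known_filler([1.2, -999, 3.4, -999, 2.8])
```

A caller could catch the `ValueError` and then try a frequency-based search instead.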

rjstephens avatar Jul 21 '20 20:07 rjstephens

@rjstephens Definitely a good idea to implement the simple filter first. I haven't gotten around to it yet. I like your suggestion to look for multiple occurrences of each value.

wfvining avatar Jul 21 '20 21:07 wfvining