Compare time used is not correct when current data is lacking

Open huss opened this issue 5 years ago • 1 comments

The compare graph uses the current time to decide the range of data to sum for display (see #476 for a description of how this is done). If the meter data in OED is not up to date, the current sum (labeled "this" on the graph) will include missing data that is treated as zero. Presumably the previous data (labeled "last" on the graph) will be present, so the comparison is not fair. This will also be an issue if the clock on the user's machine is off. One way to fix this would be to use the time of the latest meter data OED actually has as the current time. However, different meters may have different latest readings. Since this is hopefully a rare issue, I think using a different time for each meter is fine, and we can hope the range does not vary too much. We need to be careful about the label proposed for compare graphs in #477, as this may mean labeling each graph individually, either only in this case or always. We should think this through to be sure we are going to implement what we want. Note issue #404 is a special case of this when there is no data for the current period.
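To make the proposal concrete, here is a minimal sketch of ending the current period at a meter's latest reading instead of at the clock time. It is not OED code: the function names, the `period_start` argument, and the fixed `shift` between the two periods are assumptions made for illustration, and the actual range logic described in #476 may differ.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

Range = Tuple[datetime, datetime]


def effective_end(now: datetime, last_reading: Optional[datetime]) -> datetime:
    """Use the meter's latest reading time as the 'current time' when known."""
    if last_reading is None:
        # No reading time known for this meter; this sketch falls back to now.
        return now
    return last_reading


def compare_ranges(period_start: datetime, end: datetime,
                   shift: timedelta) -> Tuple[Range, Range]:
    """Return (current, previous) ranges; previous is current moved back by shift."""
    current = (period_start, end)
    previous = (period_start - shift, end - shift)
    return current, previous


# Hypothetical example: a day-over-day compare where data stops at 09:00.
now = datetime(2020, 4, 10, 14, 4)
last_reading = datetime(2020, 4, 10, 9, 0)
current, previous = compare_ranges(
    datetime(2020, 4, 10), effective_end(now, last_reading), timedelta(days=1)
)
```

Because `effective_end` is evaluated per meter, two meters on the same page can end up with different ranges, which is why the labeling question from #477 matters.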

Apr 10 '20 14:04 huss

Note issue #404 was closed in favor of this issue. There are several ways to deal with this:

1. We use the current time as we do now, but the sum is normalized back to the expected value by scaling it up to account for the time between the last point read and the current time. Since usage is not expected to be constant, this normalization is likely to distort the real behavior. This is most likely to be pronounced when comparing over one day.
2. We shift the end time back to the last reading that is available. This means the current readings determine the time range of the previous period. There is still the issue that missing points could exist in the current time range before the last reading, and there could be missing points anywhere in the previous time range. Given this, we should probably normalize as described in idea 1 but for any missing points, and note in the help documentation how missing values are dealt with. This is still different from idea 1 because the chance that there are missing points is reduced. Note the fast-pt for lines does average over missing points and points that vary in time length, so this idea could be modified to use that here and get the result directly from the database.
3. Similar to idea 2, but we remove from both periods any point that is missing in either the current or the previous time range, so both sums cover the same times. This is probably the truest comparison, but it is unclear how easy it is to implement.
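The three ideas can be compared on toy data. The sketch below assumes each period is split into equal-length slots, with `None` marking a missing slot; the function names and the slot model are hypothetical and only illustrate the arithmetic, not how OED stores readings.

```python
from typing import List, Optional, Tuple

Slots = List[Optional[float]]  # one value per equal-length slot, None if missing


def normalized_sum(slots: Slots) -> Optional[float]:
    """Idea 1 style: scale the sum of present slots up to the full slot count."""
    present = [v for v in slots if v is not None]
    if not present:
        return None  # nothing to scale from
    return sum(present) * len(slots) / len(present)


def trim_to_last_reading(current: Slots, previous: Slots) -> Tuple[Slots, Slots]:
    """Idea 2: end both periods at the last slot where current has data."""
    last = max((i for i, v in enumerate(current) if v is not None), default=-1)
    return current[: last + 1], previous[: last + 1]


def matched_sums(current: Slots, previous: Slots) -> Tuple[float, float]:
    """Idea 3: sum only the slots that have data in both periods."""
    pairs = [(c, p) for c, p in zip(current, previous)
             if c is not None and p is not None]
    return sum(c for c, _ in pairs), sum(p for _, p in pairs)


# Hypothetical data: current stops early and has a gap; previous has one gap.
current = [1.0, 2.0, None, 4.0, None, None]
previous = [1.0, 1.0, 1.0, None, 1.0, 1.0]

idea1 = normalized_sum(current)                      # scales 3 slots up to 6
cur2, prev2 = trim_to_last_reading(current, previous)
idea2 = (normalized_sum(cur2), normalized_sum(prev2))  # gaps before the cut still scaled
idea3 = matched_sums(current, previous)              # only slots 0 and 1 count
```

Idea 1 extrapolates the most (half of the current period is invented by scaling), idea 2 extrapolates only across gaps before the last reading, and idea 3 extrapolates nothing but discards real data from whichever period is more complete.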

The current method for compare is in src/server/sql/reading/create_function_get_compare_readings.sql. The methods for line and bar are in src/server/sql/reading/create_compressed_reading_views.sql. I propose changing the compare reading functions to use the daily reading view for everything but the last day in each comparison. These values already average away missing points. The bar readings do this and could either be used directly or have the idea copied. Bar readings always cover whole days, so they can do this with only the daily readings. For compare, the last day may be partial. Thus, the value returned from summing the daily readings can be added to the values that fall in the last day (except between 00:00:00 and 00:59:59, when there is not yet an hour in the last day to add in). Now that we have an hourly table that does the same averaging as the daily one, we could use those values to get the correct average. This would depend on the site refreshing this view each hour so the data is available. The date/time stamp of the last reading in that day is needed so the correct label can be added to the comparison graph. The current day is done first so that its date/time stamp can be used to get the previous period for the comparison. Missing time points in the last day must be dealt with the same way as they are when averaging daily points (via the hourly view, or by averaging the readings if we do that). This method should implement idea 2.
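The arithmetic of this proposal can be sketched outside the database. The example below assumes the daily view yields one average rate per whole day (in units per hour) and that the last day is given as raw readings of `(start, end, rate)`; those shapes and the name `compare_total` are assumptions for illustration, not the schema in the SQL files named above.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# (start, end, average rate in units per hour over that interval); assumed shape
Reading = Tuple[datetime, datetime, float]


def compare_total(daily_rates: List[float],
                  last_day: List[Reading]) -> Tuple[float, Optional[datetime]]:
    """Sum whole days from daily averages, then add the partial last day.

    Returns the total and the end time of the last reading, which is needed
    both for the graph label and to set the end of the previous period.
    """
    # Each daily average rate already smooths over missing points in that day.
    total = sum(rate * 24.0 for rate in daily_rates)
    if not last_day:
        return total, None

    def hours(start: datetime, end: datetime) -> float:
        return (end - start).total_seconds() / 3600.0

    covered = sum(hours(s, e) for s, e, _ in last_day)
    usage = sum(rate * hours(s, e) for s, e, rate in last_day)
    last_end = max(e for _, e, _ in last_day)
    day_start = min(s for s, _, _ in last_day).replace(
        hour=0, minute=0, second=0, microsecond=0)
    # Average rate over the covered time, applied to the whole elapsed time,
    # so gaps in the last day are treated like gaps in the daily averages.
    avg_rate = usage / covered
    total += avg_rate * hours(day_start, last_end)
    return total, last_end
```

Running this for the current period first gives `last_end`, and the previous period can then be computed over the same length of time ending at `last_end` minus the comparison shift, which is what idea 2 requires.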

If we use the readings directly rather than the hourly view, we could easily go up to the last reading instead of the last hour. We might as well do that if we use readings. Readings are probably the better choice if they are fast (hopefully they are, since there are very few points), because the result will not depend on updating the hourly reading view, which some sites might delay if it is slow. This would also remove the bias toward hours, which is mostly gone from OED as we expanded the types of meters we read.

Jan 21 '22 14:01 huss