Constant (average) counterfactual with eemeter daily matrix on certain datasets only.

Open jfenna opened this issue 2 years ago • 2 comments

Hi,

I'm using eemeter for a research project comparing reliability of metered savings from hourly and daily gas consumption data. I've been able to generate varying hourly counterfactuals for a set of publicly available data; however, I'm having trouble generating a varying counterfactual for daily consumption with some datasets. Instead of giving me a counterfactual that varies with temperature, I'm getting the average of the baseline meter period.

Have you ever come across an issue like this? Is this just a data issue, or is there an issue with the model?

Output from Dataset 1 (LCL-June2015v2_126)

Summary statistics from baseline period included for reference.

            value
count  385.000000
mean     1.341932
std      0.924915
min      0.000000
25%      0.708000
50%      1.299000
75%      1.908000
max      7.219000


                          reporting_observed  counterfactual_usage
2013-01-21 00:00:00+00:00               1.560              1.642287
2013-01-22 00:00:00+00:00               3.207              1.628237
2013-01-23 00:00:00+00:00               1.796              1.598183
2013-01-24 00:00:00+00:00               2.400              1.610889
2013-01-25 00:00:00+00:00               1.746              1.610034
2013-01-26 00:00:00+00:00               2.336              1.497270
2013-01-27 00:00:00+00:00               2.314              1.408208
2013-01-28 00:00:00+00:00               1.914              1.459275
2013-01-29 00:00:00+00:00               1.635              1.304730
2013-01-30 00:00:00+00:00               0.000              1.352743

Output from Dataset 2 (LCL-June2015v2_0)

Summary statistics from baseline period included for reference.

            value
count  385.000000
mean     5.912592
std      2.664848
min      0.000000
25%      5.144000
50%      6.102000
75%      6.922000
max     23.399000

                          reporting_observed  counterfactual_usage
2013-01-21 00:00:00+00:00               6.083              5.912592
2013-01-22 00:00:00+00:00               5.715              5.912592
2013-01-23 00:00:00+00:00               6.080              5.912592
2013-01-24 00:00:00+00:00               6.491              5.912592
2013-01-25 00:00:00+00:00               4.954              5.912592
2013-01-26 00:00:00+00:00               8.271              5.912592
2013-01-27 00:00:00+00:00               6.022              5.912592
2013-01-28 00:00:00+00:00               5.305              5.912592
2013-01-29 00:00:00+00:00               4.802              5.912592
2013-01-30 00:00:00+00:00               0.000              5.912592

jfenna, Jun 28 '22 18:06

Hi @jfenna - What you're seeing here is the "base load only" model being selected. Baseload-only is just that - a flat line with a constant counterfactual (which may not be exactly the average, since it's a line of best fit with a single parameter). For gas, the eemeter by default tries out the heating-only and baseload-only models and takes the one with the best fit (by highest r-squared).

philngo, Jun 28 '22 18:06

Hi @philngo - thanks for clarifying so quickly, this is really helpful. If I wanted to compare the heating-only and baseload-only models, are you able to suggest a straightforward way to do so?

J

jfenna, Jun 29 '22 07:06

A new daily model has been released. It has many bug fixes and feature improvements. Unfortunately, fitting these models individually is no longer done, because we now use a lasso-regression-inspired penalization to automatically select the best of these 4 models.

travis-recurve, Mar 19 '24 19:03