CalTRACK Issue: Statistical Measures for CalTRACK Hourly Method
Problem statement
Hi, I am implementing the CalTRACK Hourly Method for some M&V use cases. I am referring to the documentation here: https://docs.caltrack.org/en/latest/methods.html.
I have some questions about implementing the statistical measures in Section 4.3 for the Hourly Model. The two measures I am focusing on are CV(RMSE) and FSU, defined as the following in the documentation:
and
I have the following questions regarding calculating these for the Hourly Model:
- The documentation provides values for the empirical coefficients for the "billing" and "daily" models as
What should be the values used for the hourly method?
- The values for "P" and "c" can differ across the 12 monthly models under the Hourly method. When reporting CV(RMSE) and FSU over a multi-month reporting period, does CalTRACK recommend a way to aggregate these values?
- Lastly, I wanted to confirm that the total number of periods "P" for a monthly model would correspond to the total number of hours in the month.
- In the Sun and Baltazar ASHRAE conference paper they only determine these improved equations for monthly and daily interval data, so the best that can be done is to use the constant 1.26 in place of the polynomial.
- FSU is a normalized uncertainty metric, so you could multiply it by the monthly savings, resulting in the uncertainty of each model for the month. These could then be added together in quadrature and divided by the total savings for the year to get the FSU for the year.
- In the case of the billing/monthly model, P would be the number of months.
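The quadrature aggregation described in the second answer could be sketched roughly as follows. This is only an illustration: the function name `aggregate_fsu` and its arguments are hypothetical, and it assumes the monthly uncertainties are independent of one another, which is what adding in quadrature implies.

```python
import math


def aggregate_fsu(monthly_fsu, monthly_savings):
    """Combine per-month FSU values into a single FSU for the whole period.

    Assumption: the monthly uncertainties are independent, so their
    absolute values can be added in quadrature.
    """
    if len(monthly_fsu) != len(monthly_savings):
        raise ValueError("monthly_fsu and monthly_savings must have equal length")

    # FSU is a fraction of savings, so multiplying by the savings gives the
    # absolute uncertainty of each month (same units as the savings).
    absolute = [f * abs(s) for f, s in zip(monthly_fsu, monthly_savings)]

    # Add the absolute uncertainties in quadrature.
    total_uncertainty = math.sqrt(sum(u * u for u in absolute))

    total_savings = sum(monthly_savings)
    if total_savings == 0:
        raise ValueError("total savings is zero; FSU is undefined")

    # Normalize by the total savings to get back to a fractional value.
    return total_uncertainty / abs(total_savings)
```

For example, two months with FSU values of 0.3 and 0.4 and savings of 100 each have absolute uncertainties of 30 and 40, which combine to 50, so `aggregate_fsu([0.3, 0.4], [100, 100])` comes out at about 0.25.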
Hi Travis, Thank you so much for answering the questions. For question 3, I think I might have been unclear. What would be the value of P for the hourly method?
P is the number of data points in the baseline period: 8760 minus however many hours you are missing. P' is the effective number of data points, taking into consideration the lag-1 autocorrelation. Q is the number of data points in the reporting period.
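A minimal sketch of how P' might be computed from the baseline residuals, assuming the usual ASHRAE Guideline 14 style adjustment P' = P (1 - ρ) / (1 + ρ), where ρ is the lag-1 autocorrelation of the residuals. The function names and the particular estimator used for ρ are illustrative choices, not something specified in this thread.

```python
def lag1_autocorrelation(residuals):
    """Estimate the lag-1 autocorrelation coefficient of a residual series."""
    n = len(residuals)
    if n < 2:
        raise ValueError("need at least two residuals")
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("residuals are constant; autocorrelation is undefined")
    # Sum of products of each deviation with its successor.
    num = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    return num / denom


def effective_points(residuals):
    """Effective number of baseline points P', assuming P' = P(1 - rho)/(1 + rho)."""
    rho = lag1_autocorrelation(residuals)
    return len(residuals) * (1 - rho) / (1 + rho)
```

Positively autocorrelated residuals, which are common with hourly data, give a P' smaller than P and therefore a larger FSU.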
Once again, thanks Travis. That answers more things for me. One question remains though: the hourly method requires training 12 models, one for every month of the year. Each of these models will have its own set of explanatory parameters, which will vary in both value and number. Consequently, c (the number of parameters) will differ from month to month.
And since each model is trained on a different dataset (3 calendar months), P and P' will differ across the 12 models as well (not 8760, I suppose).
Hence, when computing FSU for a given month in the reporting period, we should use the c and P (and correspondingly the t and P') of that month's model, right? Then we can use the strategy you recommended to get an FSU over multiple months:
> FSU is a normalized uncertainty metric so you could multiply it by the monthly savings, resulting in the uncertainty of each model for the month. These could then be added together in quadrature and then divide by the total savings for the year to get the FSU for the year.
I also realize that the degrees of freedom, defined by (P-c-1), will be greater than 100 for all 12 monthly models. So we could potentially use the same t=1.65 throughout, but CV(RMSE) will still change from month to month.
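To make the per-month calculation concrete, here is a rough sketch of CV(RMSE) and FSU for a single monthly model. It assumes the ASHRAE Guideline 14 form of the fractional savings uncertainty, FSU = t · 1.26 · CV(RMSE) · sqrt((P/P')(1 + 2/P')(1/Q)) / F, with the constant 1.26 in place of the polynomial as suggested above, and F taken to be the savings fraction. The function names, argument names, and the choice to pass the degrees of freedom explicitly are illustrative assumptions; check the denominator against Section 4.3 of the documentation before relying on it.

```python
import math


def cv_rmse(actual, predicted, dof):
    """CV(RMSE) of a baseline model.

    dof is the denominator under the square root, passed in explicitly
    (for example P - c) so that it can follow whichever definition the
    documentation uses.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have equal length")
    if dof <= 0:
        raise ValueError("degrees of freedom must be positive")
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    mean_usage = sum(actual) / len(actual)
    return math.sqrt(sse / dof) / mean_usage


def fsu(cvrmse, P, P_eff, Q, F, t=1.65, k=1.26):
    """Fractional savings uncertainty for one monthly model.

    P     -- baseline data points used by this month's model
    P_eff -- effective baseline points after the autocorrelation correction
    Q     -- reporting-period data points for this month
    F     -- savings fraction (assumed: savings / baseline usage)
    t     -- t-statistic; 1.65 is the value discussed in this thread
    k     -- 1.26, the constant used in place of the polynomial
    """
    return t * k * cvrmse * math.sqrt((P / P_eff) * (1 + 2 / P_eff) / Q) / F
```

Each month in the reporting period would call `fsu` with the P, P', Q and CV(RMSE) of its own model, and the monthly results could then be combined in quadrature as described earlier in the thread.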
You are correct. It would be 3 calendar months, but 2 of those months are weighted at 50%, so effectively it's 2 months. I'm not totally sure what the proper way to handle this is, to be honest. It might be easier to just leave it at the 1 month being modeled at a time and call it good enough. If you didn't, then how would you deal with P' when you're only predicting on 1 month? I guess you could include the same months that the model was built on, but that seems strange to me.