open-grid-emissions
Improve method for imputing wind and solar profiles using national averages
When a region is missing wind or solar profile data, we first attempt to impute the missing profile by averaging the wind or solar profiles from directly-interconnected balancing authorities (DIBAs) located in the same time zone as the balancing authority (BA) in question. However, sometimes there are no DIBAs in the same time zone, so we need to fall back to a backup method for imputing these wind and solar profiles.
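To make the fallback order concrete, here is a minimal sketch of the two-step logic described above. It is not the project's actual implementation: the function name, its arguments, and the returned method labels are all hypothetical, and it assumes the reported profiles are stored as a wide pandas DataFrame with one column per BA.

```python
import pandas as pd


def impute_profile(ba, profiles, diba_map, ba_timezone, national_profile):
    """Return (imputed_profile, method_label) for a BA with no reported data.

    All inputs are hypothetical structures used for illustration:
      profiles         : wide DataFrame, one column per BA with reported data
      diba_map         : dict mapping a BA to its directly-interconnected BAs
      ba_timezone      : dict mapping a BA to its time zone name
      national_profile : Series holding the national-average fallback
    """
    # Keep only DIBAs that report data and share the target BA's time zone
    candidates = [
        other
        for other in diba_map.get(ba, [])
        if other in profiles.columns
        and ba_timezone.get(other) == ba_timezone.get(ba)
    ]
    if candidates:
        return profiles[candidates].mean(axis=1), "diba_average"
    # No usable DIBA: fall back to the national average
    return national_profile, "national_average"
```

Returning the method label alongside the profile makes it easy to flag which regions ended up on the national fallback, which are the cases this issue is concerned with.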
Our current method involves calculating an average profile based on all available wind and solar data nationally. When averaging these data, we currently use the local prevailing time to group the data (so, for example, solar data for noon Pacific time would be grouped with solar data for noon Eastern time). This method performs reasonably well for solar (see https://github.com/singularity-energy/open-grid-emissions/issues/52), since the solar profile generally depends on the sun's local position in the sky. However, this method performs much worse for wind, since wind patterns in different parts of the country may be quite different.
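For reference, grouping on local prevailing time can be sketched roughly as follows. This is an illustration rather than the existing code: the long-format table and its column names (`ba_code`, `datetime_utc`, `timezone`, `profile`) are assumptions.

```python
import pandas as pd


def national_average_by_local_time(profiles: pd.DataFrame) -> pd.Series:
    """Average profiles across BAs, aligned on local wall-clock time.

    `profiles` is a hypothetical long-format table with columns:
      ba_code, datetime_utc (tz-aware, UTC), timezone (IANA name), profile
    """
    local_parts = []
    for tz, group in profiles.groupby("timezone"):
        group = group.copy()
        # Convert to local prevailing time, then drop the tz info so that
        # noon Pacific and noon Eastern end up with the same grouping key
        group["datetime_local"] = (
            group["datetime_utc"].dt.tz_convert(tz).dt.tz_localize(None)
        )
        local_parts.append(group)
    local = pd.concat(local_parts)
    return local.groupby("datetime_local")["profile"].mean()
```

With one BA in `America/Los_Angeles` and one in `America/New_York`, the values reported at 19:00 UTC and 16:00 UTC on a summer day both land on 12:00 local time and are averaged together.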
There are several conceptual questions that I'd be interested in answering to help improve this national imputation method.
Grouping on local prevailing time or local standard time?
We currently group data based on local prevailing time. However, because certain regions in the US (specifically Arizona) do not observe daylight saving time (DST), this leads to some duplicate timestamp values during the transition hours between DST and standard time. It also means that any data from Arizona is offset from all of the other data by an hour while DST is in effect. We currently deal with the duplicate timestamp issue by simply dropping the duplicates. However, there are more robust options for handling this:
- Convert all local prevailing timestamps to local standard time across all time zones before averaging. The resulting averaged data would then need to be localized to local standard time and then converted to local prevailing time. This approach is probably the most robust, but would likely be computationally expensive.
- Drop data for any time zones that don't follow DST so that all of the local timestamps are consistent. This approach assumes that the local time zone for which we are imputing these profiles follows DST.
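The first option could be sketched as below, assuming the underlying data carry tz-aware UTC timestamps. The offset table and both function names are hypothetical. Because standard time is a fixed offset from UTC, the conversion is a plain vectorized shift, which may turn out to be cheaper than expected.

```python
import pandas as pd

# Hypothetical lookup: standard-time UTC offset (hours) for each time zone
STANDARD_UTC_OFFSET = {
    "America/New_York": -5,
    "America/Chicago": -6,
    "America/Denver": -7,
    "America/Phoenix": -7,
    "America/Los_Angeles": -8,
}


def to_local_standard_time(datetime_utc: pd.Series, timezone: str) -> pd.Series:
    """Convert tz-aware UTC timestamps to naive local standard time."""
    offset = pd.Timedelta(hours=STANDARD_UTC_OFFSET[timezone])
    # Standard time never jumps, so every UTC hour maps to a unique local hour
    return datetime_utc.dt.tz_localize(None) + offset


def standard_to_prevailing(datetime_standard: pd.Series, timezone: str) -> pd.Series:
    """Convert naive local standard timestamps to tz-aware prevailing time."""
    offset = pd.Timedelta(hours=STANDARD_UTC_OFFSET[timezone])
    utc = (datetime_standard - offset).dt.tz_localize("UTC")
    return utc.dt.tz_convert(timezone)
```

Going through UTC on the way back avoids having to localize ambiguous wall-clock times directly, since the repeated 1 AM hour in the fall is distinct in both UTC and standard time.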
Grouping on local time or UTC time?
Grouping on local time seems to make sense for solar data, but it may not be the best approach for wind data. This partially depends on whether wind patterns in different regions tend to be driven mostly by local diurnal effects (e.g. wind is consistently stronger during certain times of day no matter where you are located), in which case using local time makes sense, or if wind patterns are more driven by continental weather/pressure patterns that influence all regions at the same time, no matter the local time of day (in which case using UTC time would be more appropriate).
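One way to test this empirically is to check whether regional wind profiles correlate more strongly with each other when aligned on UTC or on local time. The following is a rough diagnostic sketch, not an established method: the function names, the wide input table, and the offset mapping are all assumptions, and mean pairwise Pearson correlation is just one possible similarity measure.

```python
import numpy as np
import pandas as pd


def mean_pairwise_correlation(wide: pd.DataFrame) -> float:
    """Mean of the off-diagonal Pearson correlations between columns."""
    corr = wide.corr().to_numpy()
    off_diagonal = corr[~np.eye(len(corr), dtype=bool)]
    return float(np.nanmean(off_diagonal))


def compare_alignments(utc_profiles: pd.DataFrame, utc_offsets: dict) -> dict:
    """Compare cross-region similarity under UTC vs. local-time alignment.

    utc_profiles : hourly rows indexed by naive UTC timestamps, one column per BA
    utc_offsets  : hypothetical mapping of BA -> standard UTC offset in hours
    """
    aligned = {}
    for ba in utc_profiles.columns:
        series = utc_profiles[ba].copy()
        # Relabel each observation with its local standard timestamp
        series.index = series.index + pd.Timedelta(hours=utc_offsets[ba])
        aligned[ba] = series
    local_profiles = pd.concat(aligned, axis=1)
    return {
        "utc": mean_pairwise_correlation(utc_profiles),
        "local": mean_pairwise_correlation(local_profiles),
    }
```

If the `local` score is clearly higher, diurnal effects likely dominate and local-time grouping is reasonable; if `utc` is higher, synoptic-scale weather may matter more. Low scores under both alignments would suggest that national averaging is a poor fit for wind regardless of how the data are grouped.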
Should we even use national imputation for wind profiles?
It seems that wind patterns might be pretty localized, so a national average wind profile (no matter how it is calculated) is unlikely to represent the local wind pattern in a specific region very well. If national imputation is not a robust method for estimating wind profiles for local areas, what alternative approaches could work? As an example of a tough case for imputation, consider a state like Alaska or Hawaii: no local wind data is reported, there are no interconnected BAs, there are no other BAs even in the same timezone, and the state is geographically isolated from the rest of the country. How could we come up with a reasonable estimate for these regions?
One option would be to develop a library of typical regional wind patterns for these regions based on historical data (using a tool like PySAM). These patterns would not necessarily be correlated with the wind and solar in each specific year, but would in theory represent what a typical wind pattern would look like in this region. If taking this approach, it would probably make sense to use a month-hour average value, and to take an average or P50 value for multiple years.
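The month-hour aggregation could look something like the sketch below. It assumes the multi-year history (for example, modeled capacity factors) is already available as an hourly pandas Series indexed by naive local standard time. How that history is produced is out of scope here, and the function name is hypothetical.

```python
import pandas as pd


def typical_month_hour_profile(history: pd.Series) -> pd.Series:
    """Build a typical profile indexed by (month, hour) from multi-year data.

    `history` is a hypothetical hourly series of capacity factors indexed by
    naive local standard timestamps spanning several years.
    """
    idx = history.index
    # Step 1: mean value for each (year, month, hour) combination
    per_year = history.groupby([idx.year, idx.month, idx.hour]).mean()
    per_year.index.names = ["year", "month", "hour"]
    # Step 2: P50 (median) across years for each (month, hour)
    return per_year.groupby(level=["month", "hour"]).median()
```

The result has 12 x 24 values when the history covers every month and hour, and can be looked up by the month and hour of each timestamp that needs an imputed value. Swapping `.median()` for `.mean()` in the second step gives the multi-year average instead of the P50.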
In addition to examining national profiles, we should evaluate how well modeled/TMY wind and solar profiles perform compared to the current DIBA imputation method.
This paper includes some good methods/datasets, especially for wind: https://bescdn.breakthroughenergy.org/publications/US_Test_System_with_High_Spatial_and_Temporal_Resolution.pdf (see III.D)