nowcasting_dataset Implement `PVPhysicsPredictionDataSource`

Implement `PVPhysicsPredictionDataSource`

Open JackKelly opened this issue 2 years ago • 2 comments

Detailed Description

For all timesteps, and for all PV systems in the region of interest, include:

Two sets of predicted PV power using pvlib's physical PV prediction (using the PV system orientation metadata, if available):
- Use NWP predictions (using an NWP init time at or before t0).
- Use clear-sky irradiance.
- ~~Maybe experiment with manually mapping from the inverter make and manufacturer in the metadata to pvlib's specifications.~~ UPDATE: I'm not sure the payback is worth the effort.
~~The max actual PV power for each time of interest from the last 2 weeks.~~
- ~~Need to do some experimentation to check if 2 weeks is a good time. It might be better to find the max for a given sun angle.~~
- ~~This is useful for 2 reasons:~~
  - ~~To create a "shading-aware" physics based PV forecast: min(pvlib_forecast(t), max_pv_power_for_last_2_weeks(t)).~~
  - ~~actual_pv_power(t) / max_pv_power_for_last_2_weeks(t) should tell us what proportion of sunlight is being blocked by clouds.~~ UPDATE: I think the PV power production signal is too noisy to use simple approaches like this to model shading. Instead, I think we should train an ML model to handle shading.
The elevation angle of the sun
The azimuth of the sun (unless the Sun data source already includes this information for each PV system). This data is useful for a simple ML model that takes the above inputs and estimates the residual of pvlib's forecast for each PV system.
NWP variables for each PV system (maybe interpolated to 5-minutely)
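
For the "interpolated to 5-minutely" part, here is a minimal sketch of linear interpolation from coarser NWP steps onto a 5-minute grid. It uses only the standard library; the function name `interpolate_to_5_min` and the list-based inputs are assumptions for illustration, not anything that exists in `nowcasting_dataset`. It also assumes the gaps between NWP steps are whole multiples of 5 minutes.

```python
from datetime import datetime, timedelta


def interpolate_to_5_min(times, values):
    """Linearly interpolate (time, value) samples onto a 5-minute grid.

    Hypothetical helper. `times` must be sorted ascending, and the gap
    between consecutive times is assumed to be a multiple of 5 minutes.
    """
    if not times:
        return []
    step = timedelta(minutes=5)
    out = []
    # Walk over each pair of neighbouring NWP steps.
    for t0, v0, t1, v1 in zip(times, values, times[1:], values[1:]):
        span = (t1 - t0).total_seconds()
        t = t0
        while t < t1:
            frac = (t - t0).total_seconds() / span
            out.append((t, v0 + frac * (v1 - v0)))
            t += step
    # The loop above excludes the final timestamp, so add it explicitly.
    out.append((times[-1], values[-1]))
    return out


# Example: hourly downward shortwave flux (W/m^2, made-up numbers).
hourly_times = [datetime(2022, 2, 20, 10), datetime(2022, 2, 20, 11)]
hourly_values = [100.0, 160.0]
five_minutely = interpolate_to_5_min(hourly_times, hourly_values)
```

In practice this would more likely be done with xarray or pandas on the whole NWP grid; the loop above is only meant to pin down what "interpolated" means here.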

Maybe use quite long history and forecast durations. Maybe 2 days of forecast and 2 days of history?

Also include:

The max actual PV power for the last 12 months (this is probably what we should use to rescale PV power to [0, 1]. Using the max across the entire timeseries won't capture panel degradation etc.)
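
A rough sketch of that rescaling, assuming a per-system timeseries held as two parallel lists. The name `normalise_by_trailing_max` and the 365-day window are illustrative assumptions. Each reading is divided by the max over the trailing window (including the reading itself), so the result stays in [0, 1] and follows slow changes such as degradation.

```python
from datetime import datetime, timedelta


def normalise_by_trailing_max(times, power, window=timedelta(days=365)):
    """Divide each reading by the max over the trailing `window`.

    Hypothetical helper. `times` must be sorted ascending. This is a
    simple O(n * window) loop, written for clarity rather than speed.
    """
    out = []
    start = 0
    for i, t in enumerate(times):
        # Drop readings that are older than the window.
        while times[start] < t - window:
            start += 1
        peak = max(power[start:i + 1])
        out.append(power[i] / peak if peak > 0 else 0.0)
    return out


# Made-up readings (kW) spread over a bit more than two years.
t0 = datetime(2020, 1, 1, 12)
times = [t0 + timedelta(days=d) for d in (0, 200, 400, 800)]
power = [4.0, 2.0, 3.0, 1.5]
scaled = normalise_by_trailing_max(times, power)
```

Note how the 3.0 kW reading on day 400 scales to 1.0: the 4.0 kW peak from day 0 has dropped out of the window, which is the behaviour a whole-timeseries max would miss.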

Before building the data source, do some experiments in a Jupyter Notebook:

Try computing all of the above and see how well it performs as a PV forecast. If nothing else, this is all a useful baseline algorithm.
Try extending pvlib to consume UKV NWP.
Experiment with a simple (boosted regression tree?) model which predicts the residual.
Can we reliably see shading from the last 2 weeks of data? What if the last two weeks were dull weather? It may be better to compute a "shading plot" (a scatter plot of (actual PV power / expected PV power) vs solar angle) for multiple sun azimuth angles, and show this plot to an ML model. Or fit a curve to the shading plot.
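
One possible way to build the data behind such a "shading plot", as a sketch only: bin the ratio (actual PV power / expected PV power) by sun azimuth and solar elevation, then summarise each bin. The function name `shading_profile`, the bin widths, and the choice of the median as the summary are all assumptions. A high quantile might be more robust to dull weather, which is exactly the concern raised above.

```python
from collections import defaultdict
from statistics import median


def shading_profile(samples, azimuth_bin_deg=30.0, elevation_bin_deg=5.0,
                    min_expected=1e-6):
    """Median (actual / expected) power per (azimuth, elevation) bin.

    Hypothetical helper. `samples` is an iterable of tuples:
    (azimuth_deg, elevation_deg, actual_power, expected_power).
    Returns {(azimuth_bin_start, elevation_bin_start): median_ratio}.
    """
    ratios = defaultdict(list)
    for azimuth, elevation, actual, expected in samples:
        # Skip night-time / near-zero expected power to avoid dividing by ~0.
        if expected <= min_expected:
            continue
        key = (int(azimuth // azimuth_bin_deg),
               int(elevation // elevation_bin_deg))
        ratios[key].append(actual / expected)
    return {
        (a * azimuth_bin_deg, e * elevation_bin_deg): median(r)
        for (a, e), r in ratios.items()
    }


# Made-up samples: low ratios in the morning (east) could hint at shading.
samples = [
    (95.0, 12.0, 1.0, 4.0),
    (100.0, 13.0, 2.0, 4.0),
    (110.0, 11.0, 3.0, 4.0),
    (185.0, 32.0, 4.0, 4.0),
    (185.0, 32.0, 1.0, 0.0),  # expected power is zero, so this is skipped
]
profile = shading_profile(samples)
```

The resulting dict could be scattered directly, fed to an ML model as features, or used as the points for a curve fit.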

Context

As discussed in https://github.com/openclimatefix/power_perceiver/issues/7, I'm now thinking of predicting PV as a chain of models, each of which predicts the residuals of the previous model.
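
To make the "chain of models" idea concrete, here is a toy sketch. Everything in it is a stand-in: `physics_forecast` is a placeholder for the pvlib forecast (it does not call pvlib), and `MeanOffsetModel` is a deliberately trivial learner standing in for something like a boosted regression tree. The only point is the structure: each stage is fitted on the residual left by the sum of the stages before it.

```python
from statistics import mean


def physics_forecast(x):
    """Placeholder for the pvlib forecast (assumption: not real pvlib)."""
    return 0.8 * x["clearsky_kw"]


class MeanOffsetModel:
    """Toy learner: always predicts the mean of the targets it was fitted on."""

    def fit(self, inputs, targets):
        self.offset = mean(targets)
        return self

    def predict(self, x):
        return self.offset


def fit_residual_chain(base_predict, learners, inputs, targets):
    """Fit each learner on the residual of the summed earlier stages."""
    stages = [base_predict]
    for learner in learners:
        current = [sum(stage(x) for stage in stages) for x in inputs]
        residuals = [y - p for y, p in zip(targets, current)]
        stages.append(learner.fit(inputs, residuals).predict)
    return stages


def predict_chain(stages, x):
    """The final forecast is the sum of every stage's output."""
    return sum(stage(x) for stage in stages)


# Made-up training data: actual PV power is below the physics forecast.
inputs = [{"clearsky_kw": 1.0}, {"clearsky_kw": 2.0}]
targets = [0.5, 1.5]
stages = fit_residual_chain(physics_forecast, [MeanOffsetModel()], inputs, targets)
forecast = predict_chain(stages, {"clearsky_kw": 1.0})
```

Here the second stage learns an offset of roughly -0.2 kW, so the chained forecast for 1.0 kW of clear-sky power is roughly 0.6 kW instead of 0.8 kW.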

Feb 20 '22 10:02 JackKelly
