
Implement `PVPhysicsPredictionDataSource`

Open JackKelly opened this issue 2 years ago • 2 comments

Detailed Description

For all timesteps, and for all PV systems in the region of interest, include:

  • Two sets of predicted PV power from pvlib's physical PV model, using the PV system orientation metadata where available:
    • One driven by NWP predictions (using an NWP init time at or before t0).
    • One driven by clear-sky irradiance.
    • ~~Maybe experiment with manually mapping from the inverter make and manufacturer in the metadata to pvlib's specifications.~~ UPDATE: I'm not sure the payoff is worth the effort.
  • ~~The max actual PV power for each time of interest from the last 2 weeks.~~
    • ~~Need to do some experimentation to check if 2 weeks is a good time. It might be better to find the max for a given sun angle.~~
    • ~~This is useful for 2 reasons:~~
      • ~~To create a "shading-aware" physics based PV forecast: min(pvlib_forecast(t), max_pv_power_for_last_2_weeks(t)).~~
      • ~~actual_pv_power(t) / max_pv_power_for_last_2_weeks(t) should tell us what proportion of sunlight is being blocked by clouds.~~ UPDATE: I think the PV power production signal is too noisy to use simple approaches like this to model shading. Instead, I think we should train an ML model to handle shading.
  • The elevation angle of the sun.
  • The azimuth of the sun (unless the Sun data source already includes this information for each PV system). These inputs are useful for a simple ML model that takes them and estimates the residual of pvlib's forecast for each PV system.
  • NWP variables for each PV system (maybe interpolated to 5 minutely)
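
A minimal sketch of the "interpolated to 5 minutely" idea for the NWP variables, assuming they arrive as an hourly pandas DataFrame for one PV system. The column names and values below are made up for illustration, and time-weighted linear interpolation is just one plausible choice:

```python
import pandas as pd

# Hypothetical hourly NWP values at one PV system's location (made-up numbers).
hourly_index = pd.date_range("2021-06-01 00:00", periods=4, freq="60min", tz="UTC")
nwp_hourly = pd.DataFrame(
    {
        "temperature_c": [10.0, 12.0, 15.0, 14.0],
        "cloud_cover": [0.2, 0.4, 0.8, 0.6],
    },
    index=hourly_index,
)


def interpolate_to_5_minutely(nwp: pd.DataFrame) -> pd.DataFrame:
    """Upsample to a 5-minute grid and fill the gaps by linear interpolation in time."""
    return nwp.resample("5min").interpolate(method="time")


nwp_5min = interpolate_to_5_minutely(nwp_hourly)
print(nwp_5min.head(13))
```

Linear interpolation will smooth over anything that happens between NWP steps, so it is probably only a starting point.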

Maybe use quite long history and forecast durations. Maybe 2 days of forecast and 2 days of history?

Also include:

  • The max actual PV power for the last 12 months (this is probably what we should use to rescale PV power to [0, 1]. Using the max across the entire timeseries won't capture panel degradation etc.)
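
A rough sketch of rescaling by the trailing 12-month max, assuming PV power is a pandas Series with a sorted DatetimeIndex. The function name, the 365-day window, and the example numbers are assumptions for illustration:

```python
import pandas as pd


def normalise_by_trailing_max(pv_power: pd.Series, window: str = "365D") -> pd.Series:
    """Rescale PV power to [0, 1] using the max over the trailing time window."""
    # Time-based rolling windows include the current sample, so the ratio never exceeds 1.
    trailing_max = pv_power.rolling(window, min_periods=1).max()
    # Where the trailing max is zero there is nothing to scale by, so leave NaN.
    normalised = pv_power / trailing_max.where(trailing_max > 0)
    return normalised.clip(0.0, 1.0)


# Made-up peak readings, 200 days apart, from a system whose output is degrading.
index = pd.date_range("2020-01-01", periods=4, freq="200D", tz="UTC")
pv_power = pd.Series([100.0, 90.0, 80.0, 40.0], index=index)

normalised = normalise_by_trailing_max(pv_power)
print(normalised)
```

In this toy series the last reading scales to 0.5 against the trailing max of 80, whereas dividing by the all-time max of 100 would give 0.4.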

Before building the data source, do some experiments in a Jupyter Notebook:

  • Try computing all of the above and see how well it performs as a PV forecast. If nothing else, this is all a useful baseline algorithm.
  • Try extending pvlib to consume UKV NWP.
  • Experiment with a simple (boosted regression tree?) model which predicts the residual.
  • Can we reliably see shading from the last 2 weeks of data? What if the weather was dull for the whole two weeks? It may be better to compute a "shading plot" (a scatter plot of (actual PV power / expected PV power) vs solar angle) for multiple sun azimuth angles, and show this plot to an ML model. Or fit a curve to the shading plot.
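
One way the "shading plot" could be summarised numerically, as a sketch only: bin the ratio of actual to expected power by solar elevation and azimuth, then take a high quantile per bin so that a few clear periods dominate over dull weather. The column names, bin widths, the 0.95 quantile, and the synthetic data are all assumptions:

```python
import numpy as np
import pandas as pd


def shading_profile(
    df: pd.DataFrame,
    elevation_bin_deg: float = 5.0,
    azimuth_bin_deg: float = 30.0,
    quantile: float = 0.95,
) -> pd.DataFrame:
    """High quantile of actual/expected power per (elevation bin, azimuth bin)."""
    daylight = df[df["expected_power"] > 0].copy()
    daylight["ratio"] = daylight["actual_power"] / daylight["expected_power"]
    # Label each bin by its lower edge, in degrees.
    daylight["elevation_bin"] = (
        np.floor(daylight["solar_elevation"] / elevation_bin_deg) * elevation_bin_deg
    )
    daylight["azimuth_bin"] = (
        np.floor(daylight["solar_azimuth"] / azimuth_bin_deg) * azimuth_bin_deg
    )
    grouped = daylight.groupby(["elevation_bin", "azimuth_bin"])["ratio"]
    # Rows are elevation bins, columns are azimuth bins.
    return grouped.quantile(quantile).unstack("azimuth_bin")


# Synthetic system that is shaded when the sun is low in the east.
rng = np.random.default_rng(0)
n = 5000
elevation = rng.uniform(0.0, 60.0, n)
azimuth = rng.uniform(60.0, 300.0, n)
expected = 1000.0 * np.sin(np.radians(elevation)) + 1.0
cloud_factor = rng.uniform(0.2, 1.0, n)
shaded = (azimuth < 120.0) & (elevation < 20.0)
actual = expected * cloud_factor * np.where(shaded, 0.3, 1.0)

data = pd.DataFrame(
    {
        "solar_elevation": elevation,
        "solar_azimuth": azimuth,
        "expected_power": expected,
        "actual_power": actual,
    }
)
profile = shading_profile(data)
print(profile.round(2))
```

Low values in the resulting table mark sun positions where even the clearest periods fall short of the expected power, which is the signature of shading in this toy setup. Whether real PV data is clean enough for this to work is exactly what the notebook experiment would need to check.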

Context

As discussed in https://github.com/openclimatefix/power_perceiver/issues/7, I'm now thinking of predicting PV as a chain of models, each of which predicts the residuals of the previous model.
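
A toy sketch of the chain idea, where each stage is fitted to whatever residual the previous stages leave behind. Plain least-squares linear models stand in for the real models here (the issue floats boosted regression trees), and the "physics" forecast, features, and data are synthetic assumptions:

```python
import numpy as np


def fit_linear_residual_model(features: np.ndarray, residual: np.ndarray):
    """Fit residual ~ features @ w + b by least squares and return a predictor."""
    design = np.column_stack([features, np.ones(len(features))])
    coeffs, *_ = np.linalg.lstsq(design, residual, rcond=None)
    return lambda f: np.column_stack([f, np.ones(len(f))]) @ coeffs


def fit_chain(base_prediction: np.ndarray, feature_sets, target: np.ndarray):
    """Fit one model per feature set, each on the residual left by the stages before it."""
    prediction = base_prediction.copy()
    models = []
    for features in feature_sets:
        model = fit_linear_residual_model(features, target - prediction)
        prediction = prediction + model(features)
        models.append(model)
    return models, prediction


def rmse(target: np.ndarray, prediction: np.ndarray) -> float:
    return float(np.sqrt(np.mean((target - prediction) ** 2)))


# Synthetic data: a stand-in "physics" forecast that over-predicts and ignores cloud.
rng = np.random.default_rng(0)
n = 500
physics_forecast = rng.uniform(0.0, 1000.0, n)
cloud_cover = rng.uniform(0.0, 1.0, n)
actual = 0.8 * physics_forecast - 100.0 * cloud_cover + rng.normal(0.0, 10.0, n)

models, chained = fit_chain(
    physics_forecast,
    feature_sets=[physics_forecast[:, None], cloud_cover[:, None]],
    target=actual,
)
print("physics only RMSE:", rmse(actual, physics_forecast))
print("chained RMSE:     ", rmse(actual, chained))
```

The errors printed here are in-sample, so they only show the mechanics of the chain, not how well it would generalise.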

JackKelly, Feb 20 '22 10:02