Tools for Generating Synthetic Irradiance Timeseries

Open jranalli opened this issue 9 months ago • 7 comments

In my solarspatialtools library, I made an implementation of Lave et al's synthetic irradiance approach based on stacked multi-scale random cloud fields.

I wanted to offer to contribute it over here if people would be interested, because my philosophy for that project is to rely on pvlib where possible since it's much more mature and widely adopted, and only to house things that are out of pvlib's scope. In this case, it might be relevant to expansion of the scaling package in pvlib, but no offense taken if this is too far afield. This code is pretty involved (600+ lines), so it would be a big review, but is relatively self-contained.

jranalli avatar Mar 27 '25 14:03 jranalli

+1 if it can scale monthly to hourly. What inputs does it require? Any sense of expected downsides? E.g., if it's used to scale hourly to 5-minute, what differences versus ground measurements should I be concerned about?

mikofski avatar Mar 27 '25 14:03 mikofski

This isn't my method, so I don't want to misrepresent, but here's my take.

It's fully synthetic, and more geared for generating high frequency data than monthly->hourly. The most direct application is creating spatially distinct timeseries based on a single high frequency source (e.g. taking an irradiance sensor measurement and building out to give you an idea of what the statistically similar time series would look like in spatial variability across a plant). It could also be used for temporal downscaling, but again with a focus on high frequency rather than low.

Inputs are:

  • magnitudes of the wavelet modes
  • the size of the field you'd like to simulate
  • the fraction of clearsky
  • some statistics of clearsky index that you'd like to be reflected in the generated time series (mean, minimum, maximum).

So it works best when you have a time series on which to base the wavelet modes, but if you had your own ideas about what those should be you could specify them manually. Applying it spatially requires that you also know a cloud motion vector, which defines how the temporal and spatial transport are related.
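To make the inputs above a bit more concrete, here is a minimal sketch of the general idea. It is not the solarspatialtools implementation or Lave et al.'s algorithm: the function names (`multiscale_field`, `rescale`, `sensor_series`), the blocky dyadic upsampling, and the plain min/max rescaling are all assumptions made for illustration. It also assumes a frozen field that simply translates along +x, and it ignores the clear fraction and the mean of the clear sky index.

```python
import numpy as np

def multiscale_field(shape, mode_weights, rng):
    """Sum random fields at dyadic scales; mode_weights[k] scales level k.

    Hypothetical stand-in for 'magnitudes of the wavelet modes'.
    """
    ny, nx = shape
    field = np.zeros(shape)
    for k, weight in enumerate(mode_weights):
        block = 2 ** k
        # White noise on a coarse grid (ceil division so it covers the field)
        coarse = rng.standard_normal((-(-ny // block), -(-nx // block)))
        # Repeat each coarse cell into a block x block patch, then crop
        field += weight * np.kron(coarse, np.ones((block, block)))[:ny, :nx]
    return field

def rescale(field, kt_min, kt_max):
    """Map the field linearly onto [kt_min, kt_max] (mean is not controlled)."""
    unit = (field - field.min()) / (field.max() - field.min())
    return kt_min + unit * (kt_max - kt_min)

def sensor_series(field, x0, y0, px_per_step, n_steps):
    """Read the frozen field along +x; px_per_step plays the role of the
    cloud motion vector (pixels travelled per time step)."""
    cols = x0 + px_per_step * np.arange(n_steps)
    return field[y0, cols]

rng = np.random.default_rng(0)
raw = multiscale_field((64, 512), [0.2, 0.4, 0.8, 1.0, 1.5], rng)
kt_field = rescale(raw, kt_min=0.2, kt_max=1.1)

# Two sensors on the same row, 20 pixels apart along the motion direction
sensor_a = sensor_series(kt_field, x0=0, y0=32, px_per_step=2, n_steps=200)
sensor_b = sensor_series(kt_field, x0=20, y0=32, px_per_step=2, n_steps=200)
```

With 2 pixels per step and 20 pixels of separation, `sensor_b` is just `sensor_a` shifted by 10 steps, which is the sense in which the cloud motion vector ties the temporal and spatial pictures together in this toy setup.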

Here's a single demo comparing the PDFs and CDFs of clear sky index for the real and the synthetic data. Image
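For anyone who wants to make a similar comparison on their own data, a rough sketch with NumPy follows. The two arrays here are random stand-ins (not measurements and not output from the method), and `empirical_pdf_cdf` is a hypothetical helper name.

```python
import numpy as np

def empirical_pdf_cdf(kt, edges):
    """Histogram-based PDF and CDF of clear sky index on shared bin edges."""
    pdf, _ = np.histogram(kt, bins=edges, density=True)
    cdf = np.cumsum(pdf * np.diff(edges))
    return pdf, cdf

rng = np.random.default_rng(1)
# Placeholder data only; substitute real and synthetic clear sky index series
kt_measured = 1.2 * rng.beta(5, 2, size=5000)
kt_synthetic = 1.2 * rng.beta(4, 2, size=5000)

edges = np.linspace(0.0, 1.2, 25)
pdf_m, cdf_m = empirical_pdf_cdf(kt_measured, edges)
pdf_s, cdf_s = empirical_pdf_cdf(kt_synthetic, edges)

# Largest vertical gap between the two CDFs, a simple one-number summary
max_cdf_gap = np.max(np.abs(cdf_m - cdf_s))
```

Using the same bin edges for both series keeps the two curves directly comparable.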

jranalli avatar Mar 27 '25 15:03 jranalli

I think this method is insufficiently validated for inclusion in pvlib, in particular, because its likely use in pvlib is to extrapolate a point irradiance measurement to a field of irradiance and to accept each field point as realistic. This is beyond the aims of the developers, and the paper admits as much (emphasis added):

"However, when timeseries are sampled at hundreds of locations (corresponding to the hundreds of different transformer locations on the feeder), the aggregate output is much smoother and looks more realistic. As described in section Error! Reference source not found., we have ongoing test to evaluate the need for accurate distributed PV inputs. For analysis such as voltage regulator tap changes, it may not be important that a single customer be accurately portrayed because the regulator will only see the aggregate output of several PV systems."

cwhanse avatar Mar 28 '25 15:03 cwhanse

I certainly agree with that description of the method's validation and can see the potential for misinterpretation of what the spatial field really represents.

jranalli avatar Mar 28 '25 17:03 jranalli

Hi @jranalli, I might have an alternate method for synthesizing high frequency data; I think it’s an implementation of a popular algorithm. I’ll send it to you to see what you think.

mikofski avatar Mar 29 '25 00:03 mikofski

Sure, I'd love to see it. I'm definitely up for seeing a variety of documented methods. I think a comparative validation of some of the downscaling and/or synthetic irradiance methods would be useful for a future study.


jranalli avatar Mar 29 '25 14:03 jranalli

Hi Joe, the method I have was written by Patrick Mathiesen based on:

This code is probably proprietary, but I can post the other references and see if we can piece it together without reading the source code. That would be better for pvlib as referenced implementations anyway.

As I understand it, the statistics are extracted from a climatically similar high frequency dataset with sufficient duration to characterize the statistics. Then Markov chains are used to generate hourly data from the monthly totals that matches the statistics of the reference high frequency data and the monthly total of the provided data set. I'm sure there's more nuance to it than that, and sorry if I am ignorantly stating the obvious.
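A very loose sketch of that idea, under heavy assumptions: this is not the code mentioned above, and every name in it (`fit_transition_matrix`, `generate`, `match_total`) is hypothetical. It fits a first-order Markov chain to a binned reference clear sky index series, draws a new sequence, and then rescales it multiplicatively to hit a target total. The reference series is a made-up placeholder, and day/night structure is ignored for brevity.

```python
import numpy as np

def fit_transition_matrix(kt_ref, edges):
    """Count state-to-state transitions in the binned reference series."""
    n = len(edges) - 1
    states = np.clip(np.digitize(kt_ref, edges) - 1, 0, n - 1)
    counts = np.zeros((n, n))
    np.add.at(counts, (states[:-1], states[1:]), 1)
    counts += 1e-9  # rows never visited fall back to a uniform distribution
    return counts / counts.sum(axis=1, keepdims=True)

def generate(P, centers, n_steps, rng, start=0):
    """Walk the chain and return the bin-center value of each visited state."""
    states = np.empty(n_steps, dtype=int)
    states[0] = start
    for i in range(1, n_steps):
        states[i] = rng.choice(len(centers), p=P[states[i - 1]])
    return centers[states]

def match_total(series, target_total):
    """Scale the series so its sum equals the provided total."""
    return series * (target_total / series.sum())

rng = np.random.default_rng(2)
# Placeholder reference data with some persistence; use a real dataset instead
t = np.arange(5000)
kt_ref = np.clip(0.7 + 0.3 * np.sin(t / 50) + 0.1 * rng.standard_normal(5000), 0, 1.2)

edges = np.linspace(0.0, 1.2, 7)
centers = 0.5 * (edges[:-1] + edges[1:])
P = fit_transition_matrix(kt_ref, edges)

synthetic = generate(P, centers, n_steps=720, rng=rng)  # e.g. 30 days x 24 hours
scaled = match_total(synthetic, target_total=500.0)
```

Note that the final multiplicative scaling shifts the distribution away from the reference statistics, so a real implementation would presumably reconcile the two constraints more carefully.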

mikofski avatar Apr 01 '25 18:04 mikofski