Compatibility with the Array API standard?
The Array API standard makes it possible to write generic code that works with many types of arrays (NumPy, Tensorflow, PyTorch, Dask, JAX, CuPy, MXNet, Xarray, ...). In principle, this would enable a pvlib simulation to be run in parallel, or on a GPU, using only code that looks very similar to the numpy-style code we are familiar with.
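To make the idea concrete, here is a minimal sketch (not pvlib code; the function and names are hypothetical) of the usual pattern: resolve the array's namespace at runtime via the standard `__array_namespace__` protocol, then use only functions the standard defines, so the same code runs on any compatible array type.

```python
import numpy as np

def _namespace(x):
    # Resolve the array's namespace via the standard __array_namespace__
    # protocol; fall back to NumPy for plain ndarrays on NumPy < 2.0,
    # which predates the protocol.
    try:
        return x.__array_namespace__()
    except AttributeError:
        return np

def clearness_index_sketch(ghi, dni_extra):
    # Hypothetical example function: because only standard functions and
    # operators are used, the same code works for NumPy, CuPy, PyTorch,
    # JAX, etc., without any library-specific branches.
    xp = _namespace(ghi)
    return xp.clip(ghi / dni_extra, 0.0, 2.0)
```

Calling it with a CuPy or PyTorch array instead of a NumPy array would run the computation on that library's backend with no code changes.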
The Array API standard is gaining some meaningful traction in the community. For example, SciPy is adding support: see their array types tag, progress tracking tables, and developer documentation. scikit-learn is working on it too.
Adopting the standard could be a fairly straightforward way of enabling new use cases for pvlib. However, I don't have any experience with it myself and can't say anything more concrete than it seems worth looking into. So, following discussion with @echedey-ls, a good first step here is to explore the cost and benefit by prototyping compatibility in a small number of representative pvlib functions (I leave it an open question which functions might be considered representative). That would inform decisions about whether and how to pursue broader adoption across pvlib.
For reference, I copy this description from our GSoC 2025 project ideas list:
Project Title: Compatibility with the Array API
Project Description: An exciting development in the Python ecosystem is the Python array API standard, a uniform interface across the various array libraries (NumPy, Xarray, JAX, etc). Historically, pvlib has been written for compatibility with NumPy arrays and Pandas Series, but making use of the Array API to expand compatibility to other array libraries could be very beneficial. A successful proposal for this project will include some investigation about what it would take to adopt the Array API across pvlib.
Expected Outcomes: A plan/roadmap for adopting the Array API, Array API compatibility in several pvlib modules, and automated tests ensuring compatibility across the various array packages.
To the community: does this idea sound like it might be useful to your work? Please chime in!
Yes very much so. I'd like to use polars more.
Unfortunately polars is not a part of this discussion. Unless I am mistaken, it is one of the few array libraries that does not support the Array API standard.
I'm in favor because I think that adopting the Array API standard makes pvlib functions more suitable for automatic differentiation.
Sounds promising! Beyond pandas, compatibility with xarray and plain numpy are useful to me currently.
Preliminary research for potential functions to test on: I've collected all the numpy functions and their usages per function everywhere in pvlib except pvlib.iotools. That also includes private functions (which makes it harder to assign each usage to a public function, so I haven't tried solving that).
As an insight into the current status, this is a histogram of the number of functions (y) by the number of distinct numpy functions used in their source code (x):
The 14-different-numpy-methods function is pvlib.iam.marion_integrate.
There are many others near the top:
- pvlib.pvarray.curve_fit: 12
- pvlib.shading.masking_angle_passias, pvlib.singlediode._lambertw_v_from_i, pvlib.solarposition.ephemeris: 11
- pvlib.pvsystem.sapm: 10
I'd choose pvlib.iam.marion_integrate. It has patterns like `with np.errstate(invalid='ignore')` and returning a pd.Series when the input was one, which I find relevant enough to review for this case.
On the other end there are a lot of functions with few numpy usages. I'd just go with one that uses three numpy functions, e.g. pvlib.atmosphere.windspeed_powerlaw.
- functions_analysis.py - code
- numpy_usage_analysis.csv - plain results
- numpy_usage_analysis.ods - results on steroids, with graph and colors.
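For reference, a rough sketch (my guess, not actual pvlib code) of why a low-numpy-count function like windspeed_powerlaw should be an easy first target: its numerical core is all arithmetic operators, which already dispatch to the input array's own library, so no explicit namespace lookup is needed.

```python
def windspeed_powerlaw_core(wind_speed_reference, height_reference,
                            height, exponent):
    # Hypothetical reduction of pvlib.atmosphere.windspeed_powerlaw to its
    # power-law core. Only operators are used, so any Array API-compatible
    # array type works unchanged; the real pvlib function additionally
    # handles surface_type lookup and pandas round-tripping.
    return wind_speed_reference * (height / height_reference) ** exponent
```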
I did a little bit of reading, but it isn't very clear to me yet.
A compatible pvlib function would have to use the subset of common generic functions that is defined and available through the array API. Special strengths of individual back-end libraries could not be used. Also, pandas doesn't seem to be on the list of packages supporting the API, so that would require some extra thought.
It seems like supporting the API would require rebuilding pvlib from the bottom up, although you may not have to go all the way up and could just provide partial support.
Did I get any of that right?
@adriesse you're on the right track.
Special strengths of individual back-end libraries could not be used.
Yes, unique, non-API-conforming algorithms provided by each back-end library can't be used. I guess pvlib mostly relies on common operations, though.
pandas doesn't seem to be on the list of packages supporting the API, so that would require some extra thought.
Bingo. Kevin told me about pandas.DataFrame.rolling, which seems a convoluted way to do a convolution. It has a nice feature, though: it can use timedeltas as the window size (in contrast to just an index-based window width). There is also one use to calculate a wavelet function. I've never heard of it before, so I don't know how feasible it is to port to the Array API.
pandas.DataFrame.rolling usages (4)
    pvlib-python> git grep -n \.rolling
    pvlib/scaling.py:280: df = cs_long.rolling(window=intvlen, center=True, min_periods=1).mean()
    pvlib/soiling.py:71: accum_rain = rainfall.rolling(rain_accum_period, closed='right').sum()
    pvlib/soiling.py:185: accumulated_rainfall = rainfall.rolling(
    pvlib/soiling.py:198: grace_windows = rain_events.rolling(grace_period, closed='right').sum() > 0
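To illustrate the "rolling is a convolution" point, here is a sketch (assuming a fixed odd integer window, unlike the timedelta-based windows in soiling.py) of reproducing the scaling.py-style centered rolling mean with min_periods=1 via convolution. Note that np.convolve itself is not in the Array API standard, so a real port would need a sliding-window formulation built from standard ops.

```python
import numpy as np

def centered_rolling_mean(x, window):
    # Equivalent of Series.rolling(window, center=True, min_periods=1).mean()
    # for an odd integer window: a windowed sum divided by the count of
    # in-bounds samples at each position (so edges use shorter windows,
    # matching min_periods=1).
    kernel = np.ones(window)
    sums = np.convolve(x, kernel, mode='same')
    counts = np.convolve(np.ones_like(x), kernel, mode='same')
    return sums / counts
```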