pvlib-python
More descriptive errors for `detect_clearsky`
Is your feature request related to a problem? Please describe.
Using a window length that is too short relative to the data period in `detect_clearsky` produces cryptic errors.
Describe the solution you'd like
Raise a ValueError that directly explains the problem.
Example:
import pvlib
import pandas as pd
import numpy as np
start = '2012-01-01'
end = '2015-01-01'
freq = '60min'  # 'T' is a deprecated alias for minutes in recent pandas
times = pd.date_range(start=start, end=end, freq=freq)
x1 = pd.Series(np.random.rand(len(times)), index=times)
x2 = pd.Series(np.random.rand(len(times)), index=times)
pvlib.clearsky.detect_clearsky(x1, x2,
                               window_length=90, mean_diff=75, max_diff=75,
                               lower_line_length=-45, upper_line_length=80,
                               var_diff=0.032, slope_dev=75)
Results in the following:
ValueError Traceback (most recent call last)
Cell In[28], line 11
8 x1 = pd.Series(np.random.rand(len(times)), index=times)
9 x2 = pd.Series(np.random.rand(len(times)), index=times)
---> 11 pvlib.clearsky.detect_clearsky(x1, x2,
12 window_length=90, mean_diff=75, max_diff=75,
13 lower_line_length=-45, upper_line_length=80,
14 var_diff=0.032, slope_dev=75)
File ~/opt/anaconda3/envs/rdtools3_testing/lib/python3.10/site-packages/pvlib/clearsky.py:854, in detect_clearsky(measured, clearsky, times, infer_limits, window_length, mean_diff, max_diff, lower_line_length, upper_line_length, var_diff, slope_dev, max_iterations, return_components)
850 clear_line_length = _line_length_windowed(
851 scaled_clear, H, samples_per_window, sample_interval)
853 line_diff = meas_line_length - clear_line_length
--> 854 slope_max_diff = _max_diff_windowed(
855 meas - scaled_clear, H, samples_per_window)
856 # evaluate comparison criteria
857 c1 = np.abs(meas_mean - alpha*clear_mean) < mean_diff
File ~/opt/anaconda3/envs/rdtools3_testing/lib/python3.10/site-packages/pvlib/clearsky.py:602, in _max_diff_windowed(data, H, samples_per_window)
600 def _max_diff_windowed(data, H, samples_per_window):
601 raw = np.diff(data)
--> 602 raw = np.abs(raw[H[:-1, ]]).max(axis=0)
603 return _to_centered_series(raw, data.index, samples_per_window)
File ~/opt/anaconda3/envs/rdtools3_testing/lib/python3.10/site-packages/numpy/core/_methods.py:40, in _amax(a, axis, out, keepdims, initial, where)
38 def _amax(a, axis=None, out=None, keepdims=False,
39 initial=_NoValue, where=True):
---> 40 return umr_maximum(a, axis, None, out, keepdims, initial, where)
ValueError: zero-size array to reduction operation maximum which has no identity
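A minimal sketch of what likely triggers this error. It assumes (the traceback does not confirm this) that `H` is an index matrix with one row per sample in a window, and that a 90-minute window on 60-minute data leaves a single sample per window. The construction of `H` below is a hypothetical stand-in, not pvlib's own code; only the expression inside the `try` is copied from the traceback.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random(10)

# Assumption: samples per window is the floor of window / interval.
samples_per_window = int(90 / 60)  # one sample per window

# Hypothetical stand-in for H: column j holds the indices of window j.
n_windows = len(data) - samples_per_window + 1
H = np.arange(samples_per_window)[:, None] + np.arange(n_windows)[None, :]

raw = np.diff(data)
try:
    # Same expression as in _max_diff_windowed; H[:-1] has zero rows here,
    # so the reduction runs over an empty axis.
    np.abs(raw[H[:-1, ]]).max(axis=0)
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

Under these assumptions, dropping the last row of a one-row `H` leaves nothing to reduce over, which would explain why the failure surfaces inside NumPy rather than as an input-validation error.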
And the following produces a different error, even though window_length > period, as required by the docstring. (Does the window length need to be a multiple of the period?)
import pvlib
import pandas as pd
import numpy as np
start = '2012-01-01'
end = '2015-01-01'
freq = '60min'  # 'T' is a deprecated alias for minutes in recent pandas
times = pd.date_range(start=start, end=end, freq=freq)
x1 = pd.Series(np.random.rand(len(times)), index=times)
x2 = pd.Series(np.random.rand(len(times)), index=times)
pvlib.clearsky.detect_clearsky(x1, x2,
                               window_length=175, mean_diff=75, max_diff=75,
                               lower_line_length=-45, upper_line_length=80,
                               var_diff=0.032, slope_dev=75)
result:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[32], line 11
8 x1 = pd.Series(np.random.rand(len(times)), index=times)
9 x2 = pd.Series(np.random.rand(len(times)), index=times)
---> 11 pvlib.clearsky.detect_clearsky(x1, x2,
12 window_length=175, mean_diff=75, max_diff=75,
13 lower_line_length=-45, upper_line_length=80,
14 var_diff=0.032, slope_dev=75)
File ~/opt/anaconda3/envs/rdtools3_testing/lib/python3.10/site-packages/pvlib/clearsky.py:891, in detect_clearsky(measured, clearsky, times, infer_limits, window_length, mean_diff, max_diff, lower_line_length, upper_line_length, var_diff, slope_dev, max_iterations, return_components)
885 except AttributeError:
886 message = "Optimizer exited unsuccessfully: \
887 No message explaining the failure was returned. \
888 If you would like to see this message, please \
889 update your scipy version (try version 1.8.0 \
890 or beyond)."
--> 891 raise RuntimeError(message)
893 else:
894 alpha = optimize_result.x
RuntimeError: Optimizer exited unsuccessfully: NaN result encountered.
Thanks @mdeceglie. I've had it on my list for a while to implement this extension of the detect clearsky algorithm. In the meantime, PRs are welcome to improve the error detection and docstrings.
@cwhanse do you have insights into the second example? Is the requirement specifically that the window length must be greater than the period and a multiple of the period? I am speculating on the nature of that problem.
In the second case, the window contains 2 data values. Buried in the detect_clearsky algorithm is the calculation of a sample standard deviation of the slopes between points in an interval. With a window length of 175 and a data frequency of 60 (both in minutes), there are two points per interval, hence one slope, hence the divisor in the standard deviation is N-1 = 0. That returns NaN for the standard deviation, which results in no clear points being detected and the subsequent failure of the optimizer.
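The arithmetic above can be reproduced with plain NumPy. This is only an illustration of the N-1 = 0 divisor, assuming the number of samples per window is the floor of window length over sample interval; the slope value is arbitrary and nothing here calls pvlib.

```python
import warnings

import numpy as np

window_length = 175   # minutes
sample_interval = 60  # minutes

# Assumption: samples per window is the floor of the ratio.
samples_per_window = int(window_length / sample_interval)  # two points
n_slopes = samples_per_window - 1                          # one slope

slopes = np.array([0.5])  # a single, arbitrary slope value
with warnings.catch_warnings():
    # NumPy warns that the degrees of freedom are <= 0; silence it here.
    warnings.simplefilter("ignore")
    slope_std = np.std(slopes[:n_slopes], ddof=1)

print(samples_per_window, n_slopes, slope_std)
```

With `ddof=1` and a single value, the sample standard deviation is undefined, so NumPy returns NaN instead of raising.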
I think the fix is either to edit the docstring for window_length to say "at least 3 periods", or to build a fallback into the slope std. dev. calculation for the case where there's only 1 slope per interval.
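One way the "at least 3 periods" requirement could be enforced up front is sketched below. The helper name `check_window_length`, its parameters, and the error wording are all hypothetical and not part of pvlib; the sketch also assumes both arguments are in minutes and that samples per window is the floor of their ratio.

```python
def check_window_length(window_length, sample_interval, min_samples=3):
    """Hypothetical helper: raise if a window holds too few samples.

    Both arguments are assumed to be in minutes.
    """
    samples_per_window = int(window_length / sample_interval)
    if samples_per_window < min_samples:
        raise ValueError(
            f"window_length ({window_length} min) spans only "
            f"{samples_per_window} sample(s) at a {sample_interval} min "
            f"interval; at least {min_samples} are required. Increase "
            f"window_length to at least {min_samples * sample_interval} min."
        )
    return samples_per_window


# Try the two window lengths from this issue plus one that satisfies the check.
accepted = {}
for window_length in (90, 175, 180):
    try:
        check_window_length(window_length, sample_interval=60)
        accepted[window_length] = True
    except ValueError as err:
        accepted[window_length] = False
        print(err)

print(accepted)
```

A check like this would reject both examples in this issue before any windowed statistics are computed. The fallback for a single slope per interval is a separate design choice that is not sketched here.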