lightkurve
lightkurve copied to clipboard
Restore `test_bin` and `test_bin_quality`
I temporarily skipped the tests test_bin
and test_bin_quality
over in test_lightcurve.py
. These tests pass fine (locally) using SciPy v1.3.x and earlier, but the newly-released SciPy v1.4.0 has a change which causes binning to fail in the presence of NaNs (cf. https://github.com/scipy/scipy/issues/11365).
I'm opening this issue to make sure we re-enable these two tests as soon as the underlying SciPy issue is resolved.
Over in #707 I have updated requirements.txt
to exclude SciPy v1.4.0 and v1.4.1 for now. Fingers crossed the issue will be fixed in v1.4.2...
@barentsen The scipy folks said version "1.5".
I did this: pip3 install git+"https://github.com/scipy/scipy"
Collecting git+https://github.com/scipy/scipy
Cloning https://github.com/scipy/scipy to /tmp/pip-req-build-xphlmq9t
Running command git clone -q https://github.com/scipy/scipy /tmp/pip-req-build-xphlmq9t
Running command git submodule update --init --recursive -q
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Building wheels for collected packages: scipy
Building wheel for scipy (PEP 517) ... done
Created wheel for scipy: filename=scipy-1.6.0.dev0+a49e960-cp38-cp38-linux_x86_64.whl size=62739928 sha256=6d70207fee0517b89c5f5c9153b1ba9e8e2508df8c59709742d207371fb7db3a
Stored in directory: /tmp/pip-ephem-wheel-cache-fqj7cyhd/wheels/d1/2c/5a/72e8bc20052944163183e7301df53307818e86bd2b709619b0
Successfully built scipy
Installing collected packages: scipy
Attempting uninstall: scipy
Found existing installation: scipy 1.3.3
Uninstalling scipy-1.3.3:
Successfully uninstalled scipy-1.3.3
Successfully installed scipy-1.6.0.dev0+a49e960
Now, your test program from the issue report i.e.
import scipy.stats, numpy as np
x = [0.5, 0.5, 1.5, np.nan]
values = [10, 20, 30, 40]
scipy.stats.binned_statistic(x, values, bins=3)
gets this result:
Traceback (most recent call last):
File "/home/elkins/lk.py", line 4, in <module>
scipy.stats.binned_statistic(x, values, bins=3)
File "/home/elkins/.local/lib/python3.8/site-packages/scipy/stats/_binned_statistic.py", line 182, in binned_statistic
medians, edges, binnumbers = binned_statistic_dd(
File "/home/elkins/.local/lib/python3.8/site-packages/scipy/stats/_binned_statistic.py", line 532, in binned_statistic_dd
raise ValueError('%r contains non-finite values.' % (sample,))
ValueError: [[0.5, 0.5, 1.5, nan]] contains non-finite values.
Does this make more sense? Change the Nan to 42 and no exception is thrown.
What do you think scipy.stats.binned_statistic
should do with Nans? The API doesn't say anything about Nan input values: "A sequence of values to be binned." To me, that implies finite values.
Anyways, installing lightkurve 2.0.dev0 worked with no nasty messages nor scipy rollbacks.
@texadactyl Thanks for reminding me of this issue!! :+1:
v1.5.0rc1 appears to be available on pypi:
$ pip install scipy==1.5.0rc1
Using this version, I find that values
is now allowed to contain NaNs, which resolves the issue we encountered:
In [1]: import scipy.stats, numpy as np
...: x = [0.5, 0.5, 1.5, 1.5]
...: values = [10, 20, 30, np.nan]
...: scipy.stats.binned_statistic(x, values, bins=3)
Out[1]: BinnedStatisticResult(statistic=array([15., nan, nan]), bin_edges=array([0.5 , 0.83333333, 1.16666667, 1.5 ]), binnumber=array([1, 1, 3, 3]))
If we decide to merge #744 for Lightkurve v2.0 however, then we will start using astropy.timeseries.aggregate_downsample
instead of scipy.stats.binned_statistic
, in which case this issue is no longer relevant. Let's wait a week to see what happens with #744!
FYI, the issue appears to continue to exist in the latest master
I can't tell if the issue appeared in CI test. In my local build, the test continued to fail.
@pytest.mark.xfail
def test_bin():
"""Does binning work?"""
with warnings.catch_warnings(): # binsize is deprecated
warnings.simplefilter("ignore", LightkurveDeprecationWarning)
lc = LightCurve(time=np.arange(10),
flux=2*np.ones(10),
flux_err=2**.5*np.ones(10))
binned_lc = lc.bin(binsize=2)
assert_allclose(binned_lc.flux, 2*np.ones(5))
> assert_allclose(binned_lc.flux_err, np.ones(5))
E AssertionError:
E Not equal to tolerance rtol=1e-07, atol=0
E
E Mismatch: 40%
E Max absolute difference: 0.41421356
E Max relative difference: 0.41421356
E x: Quantity([0.816497, 1. , 1. , 1. , 1.414214], unit='')
E y: array([1., 1., 1., 1., 1.])
test_lightcurve.py:422: AssertionError
The local environment: python 3.8.5 on Windows; scipy 1.5.0
In the scipy 1.5.0 installed, it appears that NaN
continued to pose problems:
import scipy.stats, numpy as np
x = [0.5, 0.5, 1.5, np.nan]
values = [10, 20, 30, 40]
scipy.stats.binned_statistic(x, values, bins=3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\pkg\_winNonPortables\Anaconda3\envs\base_dev_py38_jupyter\lib\site-packages\scipy\stats\_binned_statistic.py", line 182, in binned_statistic
medians, edges, binnumbers = binned_statistic_dd(
File "C:\pkg\_winNonPortables\Anaconda3\envs\base_dev_py38_jupyter\lib\site-packages\scipy\stats\_binned_statistic.py", line 532, in binned_statistic_dd
raise ValueError('%r contains non-finite values.' % (sample,))
ValueError: [[0.5, 0.5, 1.5, nan]] contains non-finite values.
@barentsen I think this issue can be closed - at current main
, bin tests are no longer skipped nor failing.