lightkurve icon indicating copy to clipboard operation
lightkurve copied to clipboard

Restore `test_bin` and `test_bin_quality`

Open barentsen opened this issue 5 years ago • 5 comments

I temporarily skipped the tests test_bin and test_bin_quality over in test_lightcurve.py. These tests pass fine (locally) using SciPy v1.3.x and earlier, but the newly-released SciPy v1.4.0 has a change which causes binning to fail in the presence of NaNs (cf. https://github.com/scipy/scipy/issues/11365).

I'm opening this issue to make sure we re-enable these two tests as soon as the underlying SciPy issue is resolved.

barentsen avatar Jan 16 '20 18:01 barentsen

Over in #707 I have updated requirements.txt to exclude SciPy v1.4.0 and v1.4.1 for now. Fingers crossed the issue will be fixed in v1.4.2...

barentsen avatar Mar 07 '20 02:03 barentsen

@barentsen The scipy folks said version "1.5".

I did this: pip3 install git+"https://github.com/scipy/scipy"

Collecting git+https://github.com/scipy/scipy
  Cloning https://github.com/scipy/scipy to /tmp/pip-req-build-xphlmq9t
  Running command git clone -q https://github.com/scipy/scipy /tmp/pip-req-build-xphlmq9t
  Running command git submodule update --init --recursive -q
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Building wheels for collected packages: scipy
  Building wheel for scipy (PEP 517) ... done
  Created wheel for scipy: filename=scipy-1.6.0.dev0+a49e960-cp38-cp38-linux_x86_64.whl size=62739928 sha256=6d70207fee0517b89c5f5c9153b1ba9e8e2508df8c59709742d207371fb7db3a
  Stored in directory: /tmp/pip-ephem-wheel-cache-fqj7cyhd/wheels/d1/2c/5a/72e8bc20052944163183e7301df53307818e86bd2b709619b0
Successfully built scipy
Installing collected packages: scipy
  Attempting uninstall: scipy
    Found existing installation: scipy 1.3.3
    Uninstalling scipy-1.3.3:
      Successfully uninstalled scipy-1.3.3
Successfully installed scipy-1.6.0.dev0+a49e960

Now, your test program from the issue report i.e.

import scipy.stats, numpy as np
x = [0.5, 0.5, 1.5, np.nan]
values = [10, 20, 30, 40]
scipy.stats.binned_statistic(x, values, bins=3)

gets this result:

Traceback (most recent call last):

  File "/home/elkins/lk.py", line 4, in <module>
    scipy.stats.binned_statistic(x, values, bins=3)

  File "/home/elkins/.local/lib/python3.8/site-packages/scipy/stats/_binned_statistic.py", line 182, in binned_statistic
    medians, edges, binnumbers = binned_statistic_dd(

  File "/home/elkins/.local/lib/python3.8/site-packages/scipy/stats/_binned_statistic.py", line 532, in binned_statistic_dd
    raise ValueError('%r contains non-finite values.' % (sample,))

ValueError: [[0.5, 0.5, 1.5, nan]] contains non-finite values.

Does this make more sense? Change the Nan to 42 and no exception is thrown.

What do you think scipy.stats.binned_statistic should do with Nans? The API doesn't say anything about Nan input values: "A sequence of values to be binned." To me, that implies finite values.

Anyways, installing lightkurve 2.0.dev0 worked with no nasty messages nor scipy rollbacks.

texadactyl avatar Jun 09 '20 19:06 texadactyl

@texadactyl Thanks for reminding me of this issue!! :+1:

v1.5.0rc1 appears to be available on pypi:

$ pip install scipy==1.5.0rc1

Using this version, I find that values is now allowed to contain NaNs, which resolves the issue we encountered:

In [1]: import scipy.stats, numpy as np 
   ...: x = [0.5, 0.5, 1.5, 1.5] 
   ...: values = [10, 20, 30, np.nan] 
   ...: scipy.stats.binned_statistic(x, values, bins=3)                                                                       
Out[1]: BinnedStatisticResult(statistic=array([15., nan, nan]), bin_edges=array([0.5       , 0.83333333, 1.16666667, 1.5       ]), binnumber=array([1, 1, 3, 3]))

If we decide to merge #744 for Lightkurve v2.0 however, then we will start using astropy.timeseries.aggregate_downsample instead of scipy.stats.binned_statistic, in which case this issue is no longer relevant. Let's wait a week to see what happens with #744!

barentsen avatar Jun 10 '20 20:06 barentsen

FYI, the issue appears to continue to exist in the latest master

I can't tell if the issue appeared in CI test. In my local build, the test continued to fail.

@pytest.mark.xfail
    def test_bin():
        """Does binning work?"""
        with warnings.catch_warnings():  # binsize is deprecated
            warnings.simplefilter("ignore", LightkurveDeprecationWarning)
    
            lc = LightCurve(time=np.arange(10),
                            flux=2*np.ones(10),
                            flux_err=2**.5*np.ones(10))
            binned_lc = lc.bin(binsize=2)
            assert_allclose(binned_lc.flux, 2*np.ones(5))
>           assert_allclose(binned_lc.flux_err, np.ones(5))
E           AssertionError: 
E           Not equal to tolerance rtol=1e-07, atol=0
E           
E           Mismatch: 40%
E           Max absolute difference: 0.41421356
E           Max relative difference: 0.41421356
E            x: Quantity([0.816497, 1.      , 1.      , 1.      , 1.414214], unit='')
E            y: array([1., 1., 1., 1., 1.])

test_lightcurve.py:422: AssertionError

The local environment: python 3.8.5 on Windows; scipy 1.5.0

In the scipy 1.5.0 installed, it appears that NaN continued to pose problems:

import scipy.stats, numpy as np
x = [0.5, 0.5, 1.5, np.nan]
values = [10, 20, 30, 40]
scipy.stats.binned_statistic(x, values, bins=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\pkg\_winNonPortables\Anaconda3\envs\base_dev_py38_jupyter\lib\site-packages\scipy\stats\_binned_statistic.py", line 182, in binned_statistic
    medians, edges, binnumbers = binned_statistic_dd(
  File "C:\pkg\_winNonPortables\Anaconda3\envs\base_dev_py38_jupyter\lib\site-packages\scipy\stats\_binned_statistic.py", line 532, in binned_statistic_dd
    raise ValueError('%r contains non-finite values.' % (sample,))
ValueError: [[0.5, 0.5, 1.5, nan]] contains non-finite values.

orionlee avatar Sep 16 '20 05:09 orionlee

@barentsen I think this issue can be closed - at current main, bin tests are no longer skipped nor failing.

orionlee avatar Apr 25 '21 17:04 orionlee