reproject icon indicating copy to clipboard operation
reproject copied to clipboard

Flag out NaN weights in coadd & allow for weights to have different WCS than data

Open keflavich opened this issue 1 year ago • 25 comments

I encountered a severe error case in which the weights array became NaN after reprojection. Flagging out locations where weights is NaN solved the issue.

I cannot come up with an MWE; this occurs deep in the guts of a complicated mosaicking attempt, and it happened for only 1 field out of 270. I think it might be the consequence of a numerical error that occurs only under unique conditions, but it also looks logically like this fix should work.

keflavich avatar Oct 01 '24 03:10 keflavich

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 91.46%. Comparing base (0ef4287) to head (b7b0c6f). :warning: Report is 136 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #474      +/-   ##
==========================================
+ Coverage   91.00%   91.46%   +0.45%     
==========================================
  Files          25       25              
  Lines        1078     1089      +11     
==========================================
+ Hits          981      996      +15     
+ Misses         97       93       -4     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Oct 01 '24 03:10 codecov[bot]

Currently broken:

FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[interp-False-arrays] - TypeError: input_data should either be an HDU object or a tuple of (array, WCS) or (array, Header)
FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[interp-False-filenames] - ValueError: Output WCS has celestial components but input WCS does not
FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[interp-False-hdus] - ValueError: Output WCS has celestial components but input WCS does not
FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[interp-True-arrays] - TypeError: input_data should either be an HDU object or a tuple of (array, WCS) or (array, Header)
FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[interp-True-filenames] - ValueError: Output WCS has celestial components but input WCS does not
FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[interp-True-hdus] - ValueError: Output WCS has celestial components but input WCS does not
FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[exact-False-arrays] - TypeError: input_data should either be an HDU object or a tuple of (array, WCS) or (array, Header)
FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[exact-False-filenames] - NotImplementedError: Currently only data with a 2-d celestial WCS can be reprojected using flux-conserving algorithm
FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[exact-False-hdus] - NotImplementedError: Currently only data with a 2-d celestial WCS can be reprojected using flux-conserving algorithm
FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[exact-True-arrays] - TypeError: input_data should either be an HDU object or a tuple of (array, WCS) or (array, Header)
FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[exact-True-filenames] - NotImplementedError: Currently only data with a 2-d celestial WCS can be reprojected using flux-conserving algorithm
FAILED reproject/mosaicking/tests/test_coadd.py::TestReprojectAndCoAdd::test_coadd_with_weights[exact-True-hdus] - NotImplementedError: Currently only data with a 2-d celestial WCS can be reprojected using flux-conserving algorithm
FAILED reproject/mosaicking/tests/test_coadd.py::test_coadd_solar_map - TypeError: input_data should either be an HDU object or a tuple of (array, WCS) or (array, Header)

keflavich avatar Jan 25 '25 16:01 keflavich

Tests now pass locally (w/o sunpy installed...)

keflavich avatar Jan 25 '25 16:01 keflavich

@astrofrog could you review this? There are several critical fixes for spectral-cube's cube mosaicking.

Major changes are:

  • Weights are no longer assumed to be on the same grid as the data, and they can bring their own WCS
  • input and output array dtypes are not assumed to be the same; input is swapped to match output (under the assumption that that's cheaper - but perhaps this needs to be made optional / documented)
  • bad weights are removed, which was the original point of this PR, but might have been a side-effect of the first

keflavich avatar Jan 26 '25 15:01 keflavich

For coverage checks: Do we have a dask fits reader, or could we add that in? The byte swapping does get tested in spectral-cube tests, since those sometimes load different endian arrays, but I don't see a simple way to do that here.

keflavich avatar Jan 26 '25 18:01 keflavich

in case the input isn't a dask array

Technically I think this might be needed for numpy arrays too, but the FITS reader for numpy arrays already handles that?

In any case, I'll report the issues and see if we can move the byteswapping. I haven't worked up a minimal example for the dask byteswapped data; iirc, it was any dask array loaded with a fits reader

keflavich avatar Jan 30 '25 12:01 keflavich

@keflavich - just to check though, why is the byteswapping needed at all? Where does it crash if we don't byteswap?

astrofrog avatar Jan 30 '25 12:01 astrofrog

it doesn't crash. It interprets the data as having the wrong endianness and all values end up as 2e-205 and similar.

keflavich avatar Jan 30 '25 13:01 keflavich

Ok interesting, is that making use of reproject_interp?

astrofrog avatar Jan 30 '25 14:01 astrofrog

yes

keflavich avatar Jan 30 '25 14:01 keflavich

so, I didn't figure out how to build a MWE within reproject, but if you comment out the _byteordermatch call and run the spectral cube tests, you'll get errors like this:

spectral_cube/tests/test_regrid.py ...................................................F.FFF............F.F.F.F.......                                                                                                                   [100%]

================================================================================================================== FAILURES ===================================================================================================================
_____________________________________________________________________________________________________ test_mosaic_cubes[True-100-False0] ______________________________________________________________________________________________________

use_memmap = False, data_adv = PosixPath('/private/var/folders/k_/7qh4l0nn72b7qgq15pkd4hw40000gt/T/pytest-of-adam/pytest-171/test_mosaic_cubes_True_100_Fal0/adv.fits'), use_dask = True, spectral_block_size = 100

    @pytest.mark.parametrize('spectral_block_size,use_memmap', ((None, False),
                                                                (100, False),
                                                                (None, True),
                                                                (100, False),
                                                                (1, True),
                                                                (1, False),
                                                                ))
    def test_mosaic_cubes(use_memmap, data_adv, use_dask, spectral_block_size):

        pytest.importorskip('reproject')

        # Read in data to use
        cube, data = cube_and_raw(data_adv, use_dask=use_dask)

        # cube is doppler-optical by default, which uses the rest wavelength,
        # which isn't auto-computed, resulting in nan pixels in the WCS transform
        cube._wcs.wcs.restwav = constants.c.to(u.m/u.s).value / cube.wcs.wcs.restfrq

        expected_header = combine_headers(cube.header, cube.header)
        expected_wcs = WCS(expected_header).celestial

        # Make two overlapping cubes of the data
        part1 = cube[:, :round(cube.shape[1]*2./3.), :]
        part2 = cube[:, round(cube.shape[1]/3.):, :]

        assert part1.wcs.wcs.restwav != 0
        assert part2.wcs.wcs.restwav != 0

        # Mosaic give the expected header.
        result = mosaic_cubes([part1, part2],
                              # order is not a mosaic_cubes argument order='nearest-neighbor',
                              target_header=cube.header,
                              # not used roundtrip_coords=False,
                              spectral_block_size=spectral_block_size,
                              save_to_tmp_dir=False,
                              verbose=True,
                              use_memmap=use_memmap,
                              method='cube')

        # Check that the shapes are the same
        assert result.shape == cube.shape

        assert repr(cube.wcs.celestial) == repr(result.wcs.celestial)

        # Check WCS in reprojected matches wcs_out
        # (comparing WCS failed for no reason we could discern)
        assert repr(expected_wcs) == repr(result.wcs.celestial)

        # Check that values of original and result are comparable
>       np.testing.assert_almost_equal(result.filled_data[:].value,
                                       cube.filled_data[:].value, decimal=9)

spectral_cube/tests/test_regrid.py:651:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../mambaforge/envs/py312/lib/python3.12/contextlib.py:81: in inner
    return func(*args, **kwds)
../../mambaforge/envs/py312/lib/python3.12/contextlib.py:81: in inner
    return func(*args, **kwds)
../../mambaforge/envs/py312/lib/python3.12/site-packages/numpy/_utils/__init__.py:85: in wrapper
    return fun(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = (<function assert_array_almost_equal.<locals>.compare at 0x284f77920>, array([[[ 2.68308185e+185, -2.51563734e+252],
 ...0.39923534]],

       [[0.74684203, 0.85496775],
        [0.65468046, 0.96629004],
        [0.64365533, 0.74426717]]]))
kwds = {'err_msg': '', 'header': 'Arrays are not almost equal to 9 decimals', 'precision': 9, 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError:
E           Arrays are not almost equal to 9 decimals
E
E           Mismatched elements: 24 / 24 (100%)
E           Max absolute difference among violations: 3.89622506e+287
E           Max relative difference among violations: 4.55716028e+287
E            ACTUAL: array([[[ 2.683081851e+185, -2.515637336e+252],
E                   [-9.039787586e+039,  3.043705986e+222],
E                   [-1.223304808e+051,  2.728961237e+167]],...
E            DESIRED: array([[[0.214686889, 0.909502371],
E                   [0.817235373, 0.903880475],
E                   [0.70687927 , 0.118076201]],...

../../mambaforge/envs/py312/lib/python3.12/contextlib.py:81: AssertionError
------------------------------------------------------------------------------------------------------------ Captured stdout call -------------------------------------------------------------------------------------------------------------
INFO: Using memory [spectral_cube.cube_utils]
INFO: Using Cube method [spectral_cube.cube_utils]
-------------------------------------------------------------------------------------------------------------- Captured log call --------------------------------------------------------------------------------------------------------------
INFO     astropy:cube_utils.py:1004 Using memory
INFO     astropy:cube_utils.py:1004 Using Cube method

keflavich avatar Jan 31 '25 20:01 keflavich

Ok perfect, thanks!

astrofrog avatar Feb 01 '25 09:02 astrofrog

Ok I have a MWE with just reproject_interp, so I'll see how we can fix it:

import numpy as np
from astropy.io import fits
from astropy.wcs import WCS
from astropy.utils.data import get_pkg_data_filename
from reproject import reproject_interp
from dask import array as da

hdu1 = fits.open(get_pkg_data_filename("galactic_center/gc_2mass_k.fits"))[0]
hdu2 = fits.open(get_pkg_data_filename("galactic_center/gc_msx_e.fits"))[0]

output_array = np.zeros(
    shape=(hdu1.header["NAXIS2"], hdu1.header["NAXIS1"]), dtype=">f8"
)

data = da.from_array(hdu2.data)
wcs = WCS(hdu2.header)

reproject_interp(
    (data, wcs), hdu1.header, output_array=output_array, block_size=(50, 50)
)

print(np.nanmin(output_array), np.nanmax(output_array))

and it happens for reproject_exact too, so this is something that needs fixing in the generic dispatcher.

astrofrog avatar Feb 01 '25 10:02 astrofrog

Ok I've spotted the bug, I will open a separate PR shortly

astrofrog avatar Feb 01 '25 13:02 astrofrog

@keflavich - see https://github.com/astropy/reproject/pull/487

astrofrog avatar Feb 01 '25 13:02 astrofrog

You should be able to rebase against main now as I have merged the other PR

astrofrog avatar Feb 01 '25 20:02 astrofrog

(Once you rebase this I will review and merge ASAP)

astrofrog avatar Feb 02 '25 00:02 astrofrog

rebase done

keflavich avatar Feb 02 '25 03:02 keflavich

has_celestial isn't necessary, but we need something to check that there is a valid WCS. The default is just WCS(), and it is possible (likely) for a user to pass a FITS HDU with a weight array and no WCS information in the header. What's the appropriate way to do this check under APE14?

keflavich avatar Feb 03 '25 14:02 keflavich

Your suggestion seems to be to just let it break downstream if there is an invalid WCS? That's fine, I guess. I'm not a fan of that strategy in general; I like to catch obvious errors where I know they'll occur.

keflavich avatar Feb 03 '25 14:02 keflavich

@keflavich - I think reproject_and_coadd can be used for any arbitrary WCS that might not even be celestial? Or is that incorrect? In that sense, even ignoring APE 14 it does not make sense to restrict this?

astrofrog avatar Feb 03 '25 14:02 astrofrog

right.... the cube modifications made that possible. OK, I'll get rid of has_celestial, but I think the error messages may be opaque

keflavich avatar Feb 03 '25 14:02 keflavich

This fails now with:

E           NotImplementedError: Currently only data with a 2-d celestial WCS can be reprojected using flux-conserving algorithm

should we just skip these tests?

keflavich avatar Feb 03 '25 14:02 keflavich

Also, ValueError: Output WCS has celestial components but input WCS does not. That's the expected error; I haven't looked more closely yet but at least the error is clear.

keflavich avatar Feb 03 '25 15:02 keflavich

pre-commit.ci autofix

astrofrog avatar Feb 05 '25 13:02 astrofrog

@keflavich - sorry for the long delay in getting this in, this all looks good now!

astrofrog avatar Apr 28 '25 13:04 astrofrog