reproject_interp() fails silently for very large images
I'm trying to use reproject in a pipeline that makes a weighted mosaic of several 20000x20000 images. reproject_interp() gives essentially identical output to reproject_exact() for smaller output images (e.g. ~30000x20000), but at some point it starts failing silently, returning an output image in which every pixel, even ones that would have been outside the original image, is set to some arbitrary value (comparable to the central value in the actual image).

reproject_exact() can cope with the same images but is of course quite a lot slower; in any case, this looks like a bug. I can try to produce a self-contained bit of code that demonstrates this if needed.
Thanks for the report - it would be great if you could create a 'minimal' failing example to reproduce this.
OK, will do.
OK, here we go. This triggers the problem for me.
Note you'll need a system with tens of GB of RAM to run it, and it will take a little while.
For me, reducing the big image size to say 20000 or less, or using reproject_exact() instead, makes this work fine (there should be a small random image in the bottom corner), but as it is here it will produce a 40000x40000 image each pixel of which will have the same value.
from reproject import reproject_interp, reproject_exact
from astropy.io import fits
from astropy.wcs import WCS
import numpy as np

# pick algorithm to use
reproj = reproject_interp

mra = 180.0
mdec = 45.0
cellsize = 1.0 / 3600
ctype = ('RA---SIN', 'DEC--SIN')

# pick sizes here: rxsize/rysize set the (large) output grid,
# xsize/ysize the (small) input image
rxsize = rysize = 40000
xsize = ysize = 1024

rwcs = WCS(naxis=2)
rwcs.wcs.ctype = ctype
rwcs.wcs.cdelt = (-cellsize, cellsize)
rwcs.wcs.crval = [mra, mdec]
rwcs.wcs.crpix = [1, 1]

# header for the large output grid
rheader = rwcs.to_header()
rheader['NAXIS'] = 2
rheader['NAXIS1'] = rxsize
rheader['NAXIS2'] = rysize

# input header: same WCS, small image size
header = rwcs.to_header()
header['NAXIS'] = 2
header['NAXIS1'] = xsize
header['NAXIS2'] = ysize

# random input image, which should end up in the bottom corner of the output
data = np.random.rand(ysize, xsize)
hdu = fits.PrimaryHDU(header=header, data=data)

r, footprint = reproj(hdu, rheader, hdu_in=0)
fits.PrimaryHDU(header=rheader, data=r).writeto('output.fits', overwrite=True)
@mhardcastle - thanks for the example! Before I try and run it (it will take me a few days to get hold of a machine that can run it) can you check whether it is sufficient for one of the dimensions to be ~40000 for this to fail? (for instance does 40000x4000 work?)
Good question, and no: 40000x4000 does work, I'm afraid, so a single ~40000 dimension isn't sufficient to trigger it. That said, the overall size of the output image almost certainly doesn't need to be quite as big as 40000x40000 to provoke the problem.
@mhardcastle - I can't find a machine to try this on, so could you run the following examples and let me know what happens?
io.fits
This is to check whether astropy.io.fits has issues when writing out arrays this size:
import numpy as np
from astropy.io import fits

N = 1024
M = 40000

# embed a small random array in the corner of a large zero array and write it out
array = np.random.random((N, N))
output = np.zeros((M, M))
output[:N, :N] = array
fits.writeto('output.fits', output, overwrite=True)
map_coordinates
Assuming the above example works, this is to check whether map_coordinates has issues with arrays this large:
import numpy as np
from astropy.io import fits
from scipy.ndimage import map_coordinates

N = 1024
M = 40000

array = np.random.random((N, N))

# sample the small array on an MxM grid spanning twice its extent, so the
# rescaled image should land in one quadrant of the output
x = np.linspace(0, N * 2, M)
y = np.linspace(0, N * 2, M)
X, Y = np.meshgrid(x, y)
coords = np.vstack([X.ravel(), Y.ravel()])
output = map_coordinates(array, coords).reshape(X.shape)

fits.writeto('output.fits', output, overwrite=True)
exact array size
If the above example fails, could you use it to figure out more precisely where the transition happens? Could you try, for instance, to see whether the behavior changes around sizes of 32768 (2^15)?
(There was a typo in the first example; it should have had M=40000.)
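For the size search, a small driver along these lines might help (just a sketch reusing the map_coordinates test above; the probe sizes are only a suggestion, and each iteration needs tens of GB of RAM):

import numpy as np
from scipy.ndimage import map_coordinates

N = 1024
array = np.random.random((N, N))

# probe output sizes straddling 2^15 = 32768; a corrupted result should show
# up as a (near-)constant output array instead of the rescaled image
for M in (32766, 32767, 32768, 32769):
    x = np.linspace(0, N * 2, M)
    X, Y = np.meshgrid(x, x)
    coords = np.vstack([X.ravel(), Y.ravel()])
    output = map_coordinates(array, coords).reshape(X.shape)
    print(M, output.min(), output.max(), output.std())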
First one is fine — consistent with the fact that the problem doesn't appear with reproject_exact(). I'll report back on the second when it's finished.
I can arrange an account for you on a cluster with a bunch of suitable machines, but let's see if that's necessary first.
The second code fragment also runs fine (i.e. a scaled version of the 1024x1024 image appears in the lower-left quadrant of the output array).
I'll see if I can see whether 2^15 is special with my example above. It's certainly close to the boundary where things stop working.
@mhardcastle - another test (if you don't mind) would be to edit the code here:

https://github.com/astrofrog/reproject/blob/master/reproject/interpolation/high_level.py#L91

to force _reproject_full to get called instead of _reproject_celestial, just to check whether the bug is in _reproject_celestial.
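If it's easier than editing the installed file, something like this monkeypatch should have the same effect (a sketch only; it assumes both functions are visible as module-level names in high_level, which is worth double-checking against the linked source):

import reproject.interpolation.high_level as hl

# Assumption: high_level selects the code path via the module-level name
# _reproject_celestial, so rebinding it forces every call onto the generic path
hl._reproject_celestial = hl._reproject_full

# now rerun the failing example; if the output comes out correct, the bug is
# in _reproject_celestial rather than in the generic _reproject_full path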
Well, that's remarkable: image sizes of 2^15 x 2^15 demonstrate the problem with my test code above, but an image of 2^15-1 on a side does not. So I guess something about image sizes >= 2^30 elements is the root of the problem.
I'll have a look at the test you suggest.
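If that boundary is real, the arithmetic is suspiciously close to a signed 32-bit limit being hit somewhere in the coordinate handling (purely a hypothesis on my part):

import numpy as np

# 2^15 x 2^15 output pixels is exactly 2^30 elements...
print(2**15 * 2**15 == 2**30)   # True
# ...and two coordinate values per pixel is exactly 2^31 numbers,
# one past the largest signed 32-bit integer
print(2 * 2**30)                # 2147483648
print(np.iinfo(np.int32).max)   # 2147483647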
Just to check, what OS are you using?
Linux: Scientific Linux 7.
@mhardcastle - just to check, did you have a chance to try the experiment above with _reproject_full?
In any case, I think this does highlight the need to have a way to do the reprojection in chunks (https://github.com/astrofrog/reproject/issues/37).
I haven't had a spare moment, but I haven't forgotten -- I need to get this resolved to make progress on another project (or give up and go back to montage-wrapper ...) so it's not going to get buried.
(And yes, chunking would be very useful. Particularly if that allowed parallel mode to work in reproject_exact() -- it crashes on these big images presumably because it runs out of memory.)
@mhardcastle - see https://github.com/astrofrog/reproject/issues/37#issuecomment-267645100
Just checking in to see if there are any recent updates or workarounds available for this. I have a fairly large image that reproject_interp is having issues with: the returned array is filled with either nan or a single number. @astrofrog the link above gives me a page-not-found error. I did see reference to another issue that chunks the data to keep reproject_interp from falling over, but when trying to reproduce that approach it seems like some of the core utilities have been restructured, and I am not familiar enough with the code base to find the equivalent functions.
I would also love to use chunking with reproject_and_coadd()
We have been using the chunking workaround discussed in 2016 in production ever since: see https://github.com/mhardcastle/ddf-pipeline/blob/master/utils/reproj_test.py
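The core of it is simple; roughly this (an illustrative sketch, not the production code, and the function name is made up): give each output tile its own WCS by shifting CRPIX, reproject into the tiles one at a time, and keep each tile well below the ~2^30-element danger zone.

import numpy as np
from reproject import reproject_interp

def reproject_chunked(hdu_in, rwcs, shape_out, blocksize=10000):
    # allocate the full output grid up front
    out = np.full(shape_out, np.nan)
    ny, nx = shape_out
    for y0 in range(0, ny, blocksize):
        for x0 in range(0, nx, blocksize):
            y1, x1 = min(y0 + blocksize, ny), min(x0 + blocksize, nx)
            # shift the reference pixel so that tile pixel (0, 0) coincides
            # with pixel (x0, y0) of the full output grid
            twcs = rwcs.deepcopy()
            twcs.wcs.crpix = [rwcs.wcs.crpix[0] - x0, rwcs.wcs.crpix[1] - y0]
            tile, _ = reproject_interp(hdu_in, twcs,
                                       shape_out=(y1 - y0, x1 - x0))
            out[y0:y1, x0:x1] = tile
    return out

# e.g. with the reproduction script from earlier in the thread:
# r = reproject_chunked(hdu, rwcs, (rysize, rxsize))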
Hi everyone, with the latest version of reproject you can now pass e.g. block_size=(2048, 2048) to operate in chunks. I'll close the issue now, but feel free to re-open it if you are still having issues with large arrays/chunking!
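For example, with the reproduction script from earlier in the thread, something like:

from reproject import reproject_interp

# the output is computed tile by tile instead of in one giant array;
# there is also a parallel= option to spread the tiles over processes
r, footprint = reproject_interp(hdu, rheader, block_size=(2048, 2048))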