SourceXtractorPlusPlus
Problems applying SE++ to a VIS coadd image
When fitting a bulge+disk model to VIS coadded data, the comparison between auto_mag (== TU mags) and the fitted mags looks like this:
The PSF is a bit naive (a 2.0 pix Gaussian). I tried a lot but could not significantly improve the offset(s), and I wonder whether the known deficiencies in the data and the setup can explain the large offsets.
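For reference, a minimal sketch of how that naive PSF can be built as a normalized FITS stamp (the stamp size and output file name are my own choices, and SE++ may expect additional metadata depending on how the PSF is passed in):

```python
import numpy as np
from astropy.io import fits

# Circular Gaussian PSF stamp with FWHM = 2.0 pix, normalized to unit sum.
fwhm = 2.0
sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
size = 25  # odd stamp size so the peak sits on a pixel centre
y, x = np.mgrid[:size, :size] - size // 2
psf = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
psf /= psf.sum()
fits.writeto("gauss_psf_2pix.fits", psf.astype(np.float32), overwrite=True)
```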
The data is here: https://deepdip.iap.fr/#folder/624e9eeb28cafb12e8553f0c
I am getting kind of desperate about this issue. Over the last few weeks I have really tried a lot:
- using different fitting models (Sersic, PSF, Disk+Bulge);
- using different PSF's (Gaussian, the PSFEx from the morphology challenge, the coadded PSF);
- changing the configuration files (ASCII and python).
At the end of the day I am using the configuration files from the morphology challenges. Running them on both the Euclid coadd and the Morphology Challenge image, I get this comparison with auto_mag, which is comparable to the TU mag at this level:
It works on the MC image but not on the Euclid coadd. In the red cloud in the upper right the fitting basically gets zero flux, hence the humongous offset. Even if this cloud were not there, the red points would still be bad.
(Figure: top left: residual; top right: model; bottom left: science image.)
From the residual it looks like the model fitting is not completely breaking down: the negative pixel value in the centre is around a quarter of the flux of the original image, so the model is 25% too high in the centre; not too unusual, I guess. But that model-subtracted flux doesn't appear at all in the model check image (the colour bars are matched across the frames).
I did some tests last Friday; the model fitting is clearly happening, so it's not like something is completely wrong. Overall the model check image looks mostly correct by eye, just a bit off in flux.
Could it be a PSF related problem? Maybe the variable PSF is not being correctly applied?
As discussed in the telecon, I ran the dataset with 3 different PSF models (Gaussian, psfex, coadded PSF). The offsets are similar in all three cases.
In another test I used a constant RMS image set to the background RMS value. That did not help.
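For the record, such a map can be built along these lines (file names are hypothetical, and taking the median of the existing RMS map as the background RMS value is my own assumption):

```python
import numpy as np
from astropy.io import fits

# Replace the per-pixel RMS map by a constant image at the background RMS level.
rms = fits.getdata("coadd_rms.fits")
const_rms = np.full_like(rms, np.median(rms))  # median as a background-RMS proxy
fits.writeto("coadd_const_rms.fits", const_rms, overwrite=True)
```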
I did a fairly complete parameter study and, starting from a baseline solution, changed all conceivable parameters that could have an effect on the fitting (I can provide a protocol if desired).
The only parameter that changed the photometry offset significantly was set_modified_chi_squared_scale(0.01)
(see #487); the offset range then changes from [0, 15] mag to [-2, 2] mag, which is not good either.
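For completeness, in the python measurement configuration this is a single call; a minimal excerpt (assuming the usual SE++ config import):

```python
# Minimal excerpt of a SE++ python measurement configuration;
# only the chi-squared scale line matters here.
from sourcextractor.config import *

set_modified_chi_squared_scale(0.01)
```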
I made a small script to derive the reduced chi-square value from the output check images (segmentation image, residual image, RMS image).
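Roughly, the script does something like this (the file names and the number of free parameters per object are placeholders):

```python
import numpy as np
from astropy.io import fits

# Check images written by SE++ (hypothetical file names).
seg = fits.getdata("segmentation.fits")  # object IDs per pixel
res = fits.getdata("residual.fits")      # data minus fitted model
rms = fits.getdata("rms.fits")           # per-pixel noise estimate

def reduced_chi2(obj_id, n_free=11):
    """Reduced chi-square over the segmented pixels of one object."""
    mask = seg == obj_id
    chi2 = np.sum((res[mask] / rms[mask]) ** 2)
    dof = mask.sum() - n_free  # pixels minus number of fitted parameters
    return chi2 / dof
```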
When I compare the reduced chi-2 values from the SE++ fitting with derived reduced chi-2 values I get this:
For the Morphology Challenge data both chi-2 estimates are in the right range. I do not expect a close correlation, but the numbers are in the right ballpark.
For the problematic dataset the reduced chi-2 values from the fit are much smaller than expected (~0.2), and the derived values are about a factor of 10 larger. Looking at the residual image, the derived chi-2 values of 2 and larger are more realistic than the tabulated values of <<1, which would indicate overfitting.
At the end of the day, I can't understand why the fitting ends up with such small chi-2 values, and I cannot reproduce them even qualitatively.
It turns out that the fits are kind of reasonable when the chi-2 scale is changed to:
set_modified_chi_squared_scale(0.003)
With this setting a disk+bulge fit works reasonably well with a Gaussian PSF, a PSFEx file, and the coadded PSF.
What is not clear is:
- why this change makes such a big difference;
- why a value <1.0 helps, since it gives more weight to the wings of the objects, which is counterintuitive.
It could be that the missing Poisson noise in the RMS or the correlated noise (coadded data) is responsible for the strange behaviour.
As discussed yesterday, I added Poisson noise to the RMS and ran SE++ with the PSFEx model (with the default mod_scale). The results are a bit better, but not really good:
Also, when setting
set_modified_chi_squared_scale(0.003)
the fitting works.
The Poisson noise was added with:
poisson_noise = rms_data + numpy.sqrt(numpy.fabs(img_data) * exp_time) / exp_time
Lastly, I checked whether providing a gain value helps. I did that by explicitly setting
--detection-image-gain 3.5
and
--weight-type background
in both the ASCII and python configuration files.
Also that one did not help....
It's really persistent, isn't it... Is there some way to translate the chi-squared scale parameter to an effective gain?
Depending on how the coadd was constructed, you might want to set the gain to G = gain * total_exp_time, because the counts are scaled to 1 s of exposure in many coadd images (or something like that; Emmanuel can correct me).
Happy to check that out if I know exactly what and how.
The image indeed has an effective exptime of 1 s.
There are 4 exposures of 565 s each, so that would mean: G = 3.5 * 4 * 565 = 7910.0.
Would that be correct? (That's quite a high value, almost like infinite gain, i.e. 0.0.)
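In code form (values from this thread; the assumption is that the coadd counts were divided by the total exposure time):

```python
# Effective gain for a coadd normalized to 1 s of exposure.
detector_gain = 3.5      # e-/ADU from the calibrated exposures
n_exp, t_exp = 4, 565.0  # four exposures of 565 s each
effective_gain = detector_gain * n_exp * t_exp  # = 7910.0
# passed to SE++ as, e.g., --detection-image-gain 7910.0
```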
Running with the very large gain value is so far the only reduction with a reasonable result for the photometry and without the modified chi-2 scaling:
Actually the result seems to be even better than the ones above with the modified chi-2 scale.
This seems like the answer. I don't recall exactly what Emmanuel said on the call, but I think the essence of it was that the chi-2 scale was doing the same thing as the gain would do. Maybe if you set the chi-2 scale to 1/7910 = 1.26e-4 you'd get a similar result?
Emmanuel should probably check this, but I think what you want is:
poisson_noise = numpy.sqrt(numpy.fabs(img_data) * exp_time * gain) / (exp_time * gain)
total_noise = numpy.sqrt(poisson_noise**2 + background_rms**2)
If gain is the effective gain (not the original instrumental gain) and img_data is the background-subtracted pixel value, then an estimate of the standard deviation of the total noise would simply be
total_noise_rms = numpy.sqrt(numpy.fabs(img_data) / gain + background_rms**2)
With this formula:
total_noise_rms = numpy.sqrt(numpy.fabs(img_data) / gain + background_rms**2)
the noise gets very large and the 'old' detection parameters are kind of obsolete.
Shouldn't the exposure time appear in that equation?
Emmanuel's formula is a simplification of the one I wrote, where he's using the effective gain:
(effective) gain = gain * exposure time
So it is in there implicitly. Does that clear it up, or were you thinking it should appear somewhere else?
No, exposure time is out of the equation, because the GAIN in the FITS header is assumed to be the effective gain, not the original detector gain before data rescaling. Other software might interpret the GAIN keyword differently, but this is how the original SExtractor did it. Note that exposure time might not be the only contributor to changes in the effective gain (for instance if the data producer wants the results in other units, e.g., eV), so anyway the most convenient and safest way, I think, is to assume that GAIN is the effective gain.
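Putting this together, a self-contained sketch of building the total-noise map (file names are hypothetical; gain is the effective gain in the sense just described):

```python
import numpy as np
from astropy.io import fits

# Background-subtracted science image and background-only RMS map
# (hypothetical file names).
img_data = fits.getdata("coadd_sci.fits")
background_rms = fits.getdata("coadd_rms.fits")

# Effective gain: detector gain times total exposure time for this
# 1 s-normalized coadd (values from this thread).
gain = 3.5 * 4 * 565.0  # = 7910.0

# Poisson term from the source counts plus background RMS, in quadrature.
total_noise_rms = np.sqrt(np.fabs(img_data) / gain + background_rms**2)
fits.writeto("coadd_total_rms.fits", total_noise_rms.astype(np.float32),
             overwrite=True)
```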
I was using the above formula(s) with:
(effective) gain = gain * exposure time = 3.48 * 4 * 565.
The gain comes from the calibrated images and the exposure time is clear. This leads to:
That's very similar to the other results above with the modified chi-square scale. I also ran variations of the modified gain (a factor of 2 in both directions); that changes little.