Problems applying SE++ to a VIS coadd image

Open mkuemmel opened this issue 2 years ago • 19 comments

When fitting a bulge+disk model to VIS coadded data, the comparison between auto_mag (== TU mags) and the fitted mags looks like this: [image: VIS_fitting]. The PSF is a bit naive (a 2.0 pix Gaussian). I tried a lot but could not significantly reduce the offset(s), and I wonder whether the known deficiencies in the data and the setup can explain the large offsets.

The data is here: https://deepdip.iap.fr/#folder/624e9eeb28cafb12e8553f0c

mkuemmel avatar Apr 07 '22 08:04 mkuemmel

I am getting kind of desperate about this issue. In the last weeks I really tried a lot:

  • using different fitting models (Sersic, PSF, Disk+Bulge);
  • using different PSFs (Gaussian, the PSFEx from the morphology challenge, the coadded PSF);
  • changing the configuration files (ASCII and Python).

At the end of the day I am using the configuration files from the morphology challenges on both the Euclid coadd and the Morphology Challenge image, and get this comparison with auto_mag, which is comparable to the TU mag at this level: [image: fitting_problem]

It works on the MC image but not on the Euclid coadd. In the red cloud in the upper right the fitting basically gets zero flux, hence the humongous offset. Even if this cloud were not there, the red points would still be bad.

mkuemmel avatar May 31 '22 10:05 mkuemmel

[image: Screenshot from 2022-06-03 20-55-19]

Top left: residual. Top right: model. Bottom left: science image.

From the residual it looks like the model fitting is not completely breaking down: the negative pixel value in the centre is around a quarter of the flux of the original image, so the model is about 25% too high in the centre, which is not too unusual I guess. But that model-subtracted flux doesn't appear at all in the model check image. (The colour bars are matched across the frames.)

WillHartley avatar Jun 03 '22 19:06 WillHartley

I did some tests last Friday. The model fitting is clearly happening, so it's not like something is completely wrong. Overall the model check image looks mostly correct by eye, just a bit off in flux.

Could it be a PSF-related problem? Maybe the variable PSF is not being correctly applied?

marcschefer avatar Jun 07 '22 12:06 marcschefer

As discussed in the telecon, I ran the dataset with 3 different PSF models (Gaussian, psfex, coadded PSF). The offsets are similar in all three cases.

In another test I used a constant RMS image with the background RMS value. Did not help.

mkuemmel avatar Jun 27 '22 14:06 mkuemmel

I did a fairly complete parameter study and changed, from a baseline solution, all conceivable parameters that could have an effect on the fitting (can provide a protocol if desired).

The only change that affected the photometry offset significantly was setting set_modified_chi_squared_scale=0.01 (see #487): the offset range then changes from [0,15mag] to [-2,2], which is not good either.

mkuemmel avatar Jun 27 '22 15:06 mkuemmel

I made a small script to derive the reduced chi-square value from the output images (segmentation image, residual image, rms image). When I compare the reduced chi-2 values from the SE++ fitting with the derived reduced chi-2 values I get this: [image: chisquare_comparison]

For the Morphology Challenge data both chi-2 estimates are in the right range. I do not expect a close correlation, but the numbers are in the right ballpark. For the problematic dataset the reduced chi-2 values from the fit are way smaller than expected (~0.2), while the derived values are about a factor 10 larger. Looking at the residual image, the derived chi-2 values of 2 and larger are more realistic than the tabulated values <<1, which indicate overfitting.

At the end of the day, I can't understand why the fitting ends up with such small chi-2 values, and I cannot reproduce them even qualitatively.
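For readers who want to repeat this kind of cross-check, here is a minimal sketch of a per-object reduced chi-square computed from the three output images. It is not the script used in this thread: the function name, the `n_params` argument and the choice to sum only over each object's segmentation pixels are assumptions, and SE++ itself may fit over a different pixel region and with a modified chi-square, so only rough agreement should be expected.

```python
import numpy as np

def reduced_chi2_per_object(residual, rms, segmentation, n_params=0):
    """Return {object_id: reduced chi-square} computed in the image plane.

    residual     : science image minus model (2D array)
    rms          : per-pixel noise standard deviation (2D array)
    segmentation : integer map, 0 = background, k > 0 = pixels of object k
    n_params     : fitted parameters per object (hypothetical, same for all)
    """
    result = {}
    for obj_id in np.unique(segmentation):
        if obj_id == 0:
            continue  # skip background pixels
        # Use only valid pixels that belong to this object
        mask = (segmentation == obj_id) & (rms > 0) & np.isfinite(residual)
        dof = int(mask.sum()) - n_params
        if dof <= 0:
            continue  # not enough pixels to define a reduced chi-square
        chi2 = np.sum((residual[mask] / rms[mask]) ** 2)
        result[int(obj_id)] = float(chi2 / dof)
    return result
```

With arrays loaded from the residual, rms and segmentation FITS files, the returned values can be compared object by object with the reduced chi-2 column of the SE++ catalogue.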

mkuemmel avatar Jun 27 '22 15:06 mkuemmel

Turns out that the fits are kind of reasonable: [image] when the chi-2 scale is changed to set_modified_chi_squared_scale(0.003). With this setting a disk+bulge fit works reasonably well with a Gaussian PSF, a PSFEx file and the coadded PSF. What is not clear is:

  • why this change makes such a big difference;
  • a value <1.0 gives more weight to the wings of the objects, which is counterintuitive.

It could be that the missing Poisson noise in the RMS or the correlated noise (coadded data) is responsible for the strange behaviour.

mkuemmel avatar Jun 30 '22 09:06 mkuemmel

As discussed yesterday, I added Poisson noise to the RMS and ran SE++ with the PSFEx model (with the default mod_scale). The results are a bit better, but not really good: [image: noise_comparison]. Also, when setting set_modified_chi_squared_scale(0.003), the fitting works.

The Poisson noise was added with: poisson_noise = rms_data+numpy.sqrt(numpy.fabs(img_data)*exp_time)/exp_time

mkuemmel avatar Jul 01 '22 11:07 mkuemmel

Lastly, I checked whether providing a gain value helps. I did that by explicitly setting --detection-image-gain 3.5 and --weight-type background in both the ASCII and Python configuration files: [image: fit_gain]. That did not help either....

mkuemmel avatar Jul 04 '22 11:07 mkuemmel

It's really persistent isn't it.... Is there some way to translate the chi-sq scale param to an effective gain?

Depending on how the coadd was constructed, you might want to set the gain to G = gain * total_exp_time, because the counts are scaled to 1s of exposure in many coadd images (or something like that - Emmanuel can correct me).

WillHartley avatar Jul 04 '22 12:07 WillHartley

Happy to check that out if I know exactly what and how.

The image indeed has an effective exptime of 1s.

There are 4 exposures of 565s each, which would mean: G = 3.5 * 4 * 565. = 7910.0

Would that be correct (that's quite a high value, similar to inf.=0.0)?
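The arithmetic can be written out as a small sketch. The variable names are illustrative only, and the unit of the gain (electrons per ADU) is an assumption, since the thread quotes just the number:

```python
# Values quoted in this thread; the e-/ADU unit is an assumption.
instrumental_gain = 3.5      # gain of the individual calibrated exposures
n_exposures = 4              # number of exposures in the coadd
exposure_time_s = 565.0      # exposure time of each frame in seconds

# For a coadd rescaled to 1 s, the gain scales with the total exposure time.
effective_gain = instrumental_gain * n_exposures * exposure_time_s
print(effective_gain)  # 7910.0
```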

mkuemmel avatar Jul 04 '22 13:07 mkuemmel

Running with the very large gain value is so far the only reduction with a reasonable result for the photometry and without the modified chi-2 scaling: [image: fit_gain7910]

Actually the result seems to be even better than the ones above with the modified chi-2 scale.

mkuemmel avatar Jul 06 '22 13:07 mkuemmel

This seems like the answer. I don't recall exactly what Emmanuel said on the call, but I think the essence of it was that the chi-2 scale was doing the same thing as the gain would do. Maybe if you set the chi-2 scale to 1/7910 = 1.26e-4 you'd get a similar result?

WillHartley avatar Jul 06 '22 13:07 WillHartley

Emmanuel should probably check this, but I think what you want is,

poisson_noise = numpy.sqrt(numpy.fabs(img_data)*exp_time*gain)/(exp_time*gain)

total_noise = numpy.sqrt(poisson_noise**2 + background_rms**2)

WillHartley avatar Jul 07 '22 08:07 WillHartley

If gain is the effective gain (not the original instrumental gain) and img_data the pixel value with background subtracted, then an estimate of the standard deviation of the total noise would simply be total_noise_rms = numpy.sqrt(numpy.fabs(img_data) / gain + background_rms**2)
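As a runnable sketch of that estimate (the function name and argument names are illustrative, and the example numbers are made up for easy checking rather than taken from the VIS data):

```python
import numpy as np

def total_noise_rms(img_data, background_rms, effective_gain):
    """Per-pixel total noise: Poisson term plus background term in quadrature.

    img_data       : background-subtracted pixel values (array)
    background_rms : background noise standard deviation (scalar or array)
    effective_gain : gain after any rescaling of the data (assumed > 0)
    """
    poisson_variance = np.abs(img_data) / effective_gain
    return np.sqrt(poisson_variance + background_rms ** 2)

# A pixel with no source flux keeps the background RMS; a bright pixel
# gets an extra Poisson contribution.
pixels = np.array([0.0, 16 * 7910.0])
print(total_noise_rms(pixels, 3.0, 7910.0))  # [3. 5.]
```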

ebertin avatar Jul 07 '22 08:07 ebertin

With this formula:

total_noise_rms = numpy.sqrt(numpy.fabs(img_data) / gain + background_rms**2)

the noise gets very large and the 'old' detection parameters are kind of obsolete.

Shouldn't there be the exposure time in that equation?

mkuemmel avatar Jul 11 '22 11:07 mkuemmel

Emmanuel's formula is a simplification of the one I wrote, where he's using the effective gain,

(effective) gain = gain * exposure time

So it is in there implicitly. Does that clear it up, or were you thinking it should appear somewhere else?
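A quick numerical check of that equivalence, as a sketch on synthetic pixel values (the random image and the background RMS of 0.05 are made up; the gain and exposure numbers are the ones quoted earlier in the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
img_data = rng.normal(0.0, 1.0, size=(8, 8))  # synthetic, background-subtracted
background_rms = 0.05                          # made-up value
gain, exp_time = 3.5, 4 * 565.0

# Two-step form with explicit gain and exposure time
poisson_noise = np.sqrt(np.abs(img_data) * exp_time * gain) / (exp_time * gain)
total_explicit = np.sqrt(poisson_noise ** 2 + background_rms ** 2)

# Compact form using the effective gain
effective_gain = gain * exp_time
total_effective = np.sqrt(np.abs(img_data) / effective_gain + background_rms ** 2)

# Both forms give the same noise map up to floating-point rounding
assert np.allclose(total_explicit, total_effective)
```

The two agree because sqrt(x * t * g) / (t * g) equals sqrt(x / (t * g)) for non-negative x, so squaring the Poisson term gives |img_data| / effective_gain.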

WillHartley avatar Jul 11 '22 11:07 WillHartley

No, exposure time is out of the equation because the GAIN in the FITS header is assumed to be the effective gain, not the original detector gain before data rescaling. Other software might interpret the GAIN keyword differently, but this is how the original SExtractor did it. Note that exposure time might not be the only contributor to changes in the effective gain (for instance if the data producer wants the results in other units, e.g., eV), so I think the most convenient and safest way is to assume that GAIN is the effective gain.

ebertin avatar Jul 11 '22 17:07 ebertin

I was using the above formula(s) with: (effective) gain = gain * exposure time = 3.48 * 4 * 565. The gain comes from the calibrated images and the exposure time is clear. This leads to: [image: VIS_poisson_comp]

That's very similar to the other results above obtained by modifying the chi-square scale. I also ran with variations of the modified gain (factor 2 in both directions). That changes little.

mkuemmel avatar Jul 13 '22 14:07 mkuemmel