HENDRICS icon indicating copy to clipboard operation
HENDRICS copied to clipboard

How to get uncertainty in peak location from HENzsearch

Open paulray opened this issue 5 years ago • 17 comments

When I run HENzsearch with --fit-candidates, I see that I can get the parameters of the SincSquareModel by using

In [15]: efperiod = load_folding('cleanfilt_bary_nicer_xti_Z2n.nc')  
In [16]: efperiod.best_fits                                                                                                                               
Out[16]: [<SincSquareModel(amplitude=116321.52283363, mean=0.10777897, width=0.0000043)>]

This give the best fit frequency and the width of the sinc function. It would be nice if there were a way to easily get the uncertainty in the frequency determination. Can that be included somehow? Is that one of the outputs the fitter could return?

paulray avatar Feb 27 '20 18:02 paulray

Hi @paulray, I would personally calculate the TOAs after aligning the profile with HENphaseogram (set --ntimes to a small enough number so to have a clearly visible profile throughout) image and use PINT to calculate the frequency errorbar :).

The largest error contribution in the sinc fits is often due to the imperfect shape of the sync profile (because fdot), which is not encoded in any error estimate the fitting routine can do (because local minima). E.g. from the HENDRICS video tutorial:

image

If you the profile is well fit, the width of the sync is an upper limit to the error. If you have enough signal to get good TOAs, I would calculate the actual error by using PINT.

matteobachetti avatar Feb 28 '20 07:02 matteobachetti

You have a video tutorial?!?!

paulray avatar Feb 28 '20 11:02 paulray

A little outdated but yes :)

matteobachetti avatar Feb 28 '20 12:02 matteobachetti

OK, thinking about this a bit more, given the fact that Fdot, and funky window functions, tend to perturb the shape of the sinc^2 response, why are you fitting such a large regions of the Z^2 vs F space? Why not just fit a few bins around the peak? Wouldn't that be more robust? Then the error estimate should be more reliable based on the local curvature of the Z^2 vs F plane.
I'm not sure how the Fdot searches work, but you could also map out the local maximum in the 2-D F-Fdot plane to get errors on both quantities.

My main motivation here is to make a pipeline that gets reasonably reliable frequency measurements with uncertainties when I have lots of observations to process. I totally agree that fitting TOAs gives the best estimate and when I am doing things manually I definitely do that. But I've I've got 100 monitoring observations of a source I want to have a reasonably good solution that can be scripted with no manual intervention.

paulray avatar Feb 28 '20 13:02 paulray

@paulray sorry for the slow response

OK, thinking about this a bit more, given the fact that Fdot, and funky window functions, tend to perturb the shape of the sinc^2 response, why are you fitting such a large regions of the Z^2 vs F space?

I'm actually only fitting a very small space, equal to 5 * Df * oversample, where Df is the step. Df * oversample is equal to the frequency resolution of the FFT, so this is a very small frequency space indeed. The Z search is typically oversampled by a factor 16, so it amounts to the 160 bins around the peak.

Re the f/fdot 2d fit, this is indeed an improvement I will work on sooner or later. I need 48-hr days :P

matteobachetti avatar Mar 13 '20 09:03 matteobachetti

Understood!

What I meant was a very small region, as in less than the width of the main "lobe" of the sinc function, so that the fit was just trying to find the centroid of the main lobe and not trying to replicate the location or depth of the zeros or lower sidelobes. In some of my codes I've just computed the centroid from the 3 bins around the peak using a parabolic approximation.

paulray avatar Mar 13 '20 10:03 paulray

Oh, I see. Isn't it a little shaky when the peak is oversampled?

matteobachetti avatar Mar 13 '20 12:03 matteobachetti

Yeah, could be. The optimal thing to do might depend on the oversample factor.

paulray avatar Mar 13 '20 12:03 paulray

See this example. Notice how the fit is underestimating the peak and overestimating the width. But maybe I just didn't oversample enough?

cleanbary_nicer_xti_Z2n nc

paulray avatar Mar 13 '20 12:03 paulray

Oh, that is actually an effect of the orbital gaps. Your observation is longer than an orbit, right?

matteobachetti avatar Mar 13 '20 13:03 matteobachetti

When the orbital gaps are of the same order of magnitude of GTIs, the two paths I show here in the phaseogram give very similar folding/Z2 statistics

image

The actual error on the period is much larger, because the two paths, with very different period values, give similar statistics. Include Poisson error in the equation, and practically any of those peaks might be the right one.

matteobachetti avatar Mar 13 '20 13:03 matteobachetti

I realize that what I wrote is not super clear.

Basically, the sinc function we fit to the periodogram assumes an ideal situation without orbital gaps. Then, the width of the sinc is given by a factor times the inverse of the observing time and fitting the side lobes improves the fit considerably. This is what we assume in the peak fit.

However, with orbital gaps, that ideal curve is modulated and distorted by the effect I showed in the figure above. The longer the orbital gaps with respect to GTIs, the worse.

Igor Chilingarian made a more mathematical approach to the problem and it is described in the Appendix of this paper: https://iopscience.iop.org/article/10.3847/1538-4357/aa689d/pdf (In this case, it was applied to multiple observations, but the concept is the same)

matteobachetti avatar Mar 13 '20 14:03 matteobachetti

Thanks for the pointer to that paper. I'll check it out!

paulray avatar Mar 13 '20 18:03 paulray

And yes, this is very gappy data, so the actual response doesn't look like the ideal sinc function situation. And definitely I understand the integer phase ambiguity issues. They are part of life with NICER data. :-)

paulray avatar Mar 13 '20 18:03 paulray

When I run HENzsearch with --fit-candidates, I see that I can get the parameters of the SincSquareModel by using

In [15]: efperiod = load_folding('cleanfilt_bary_nicer_xti_Z2n.nc')  
In [16]: efperiod.best_fits                                                                                                                               
Out[16]: [<SincSquareModel(amplitude=116321.52283363, mean=0.10777897, width=0.0000043)>]

This give the best fit frequency and the width of the sinc function. It would be nice if there were a way to easily get the uncertainty in the frequency determination. Can that be included somehow? Is that one of the outputs the fitter could return?

Excuse my ignorance, but when I do HENzsearch with --fit-candidates I get a __.p file where I guess the model parameters are stored. How do I read that file?. Thanks

gjmlunaIAFE avatar Jul 27 '21 22:07 gjmlunaIAFE

@gjmlunaIAFE you use the pickle library: https://docs.python.org/3/library/pickle.html

matteobachetti avatar Jul 28 '21 07:07 matteobachetti

Hello Matteo, could you ellaborate a bit more on how to get the uncertainty in a period found with HENzsearch?. I found some notes from an school you taught (but I can't find then anymore). I think I understand from the video that I should use the HENphaseogram to get the TOA, but afterwards I don't know how to proceed. I'm studying a few NICER orbits of an object and would like to address if the period has changed. Thanks!, Juan

gjmlunaIAFE avatar Sep 07 '21 18:09 gjmlunaIAFE