GLS
GLS copied to clipboard
NaN response from prob with norm lnL
Hi,
I get a NaN response from prob here. Is the calculation in line 652/3 correct?
Stefan
import gls as ptpp
N = 100
time = np.random.uniform(54000., 56000., N)
flux = 0.15 * np.sin(2. * np.pi * time / 10.)
# Add some noise
flux += 0.5 * np.random.normal(0, 1, time.size)
error = 0.5 * np.ones(time.size)
gls = ptpp.Gls((time, flux, error), verbose=True, norm="lnL")
Indeed, I also can get nan
, e.g.:
>>> gls.prob(-10)
nan
https://github.com/mzechmeister/GLS/blob/a1ee47b4779cb53bc998f99304644d83d36b56ba/python/gls.py#L651-L654
A nan
can arise when you pass a positive value for lnL
(here Pn
) or not negative enough. This can lead to a negative value for 1-p
in the last line and rising it to some power causes the nan
.
But lnL
must be negative and indeed all values of gls.power should be negative.
I wanted to show that the same happens with with norm ZK
and powers larger than one. Indeed, it does, but to my surprise now an error message occurs:
>>> gls = ptpp.Gls((time, flux, error))
>>> gls.prob(1.2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "gls.py", line 654, in prob
return (1-p) ** ((self.N-3)/2)
ValueError: negative number cannot be raised to a fractional power
It turns out that for lnL
the values are converted into np.float64
and the error can be surpressed also for ZK
in this way:
>>> gls.prob(np.float64(1.2))
nan
A wrong input could be catched, but the code would be blown up with a of lot error handling for all the norms and methods (prob
, probInv
, etc.).
OK.
What confused me was that it happened with quite legitimate periodograms. I used this kind of code.
maxPower = gls.pmax
print("maxPower: ", maxPower)
print("stats: ", gls.stats(maxPower))
Problem is pmax, which is derived from self.p in _peakPeriodogram (not subject to later renormalization). self.power is the renormalized quantity. Also with pmax = max(gls.power), I get a weird result along the lines of
stats: {'Pn': -103.12840147567547, 'Prob': 7246890160.376161, 'FAP': nan}
I cannot confirm this. I get the same result
>>> gls = ptpp.Gls((time, flux, error))
>>> gls.stats(gls.pmax)
{'FAP': 0.2661578990397514, 'Pn': 0.09955558972770197, 'Prob': 0.006182440735281105}
>>> gls = ptpp.Gls((time, flux, error), norm="lnL")
>>> gls.stats(max(gls.power))
{'FAP': 0.2661578990397514, 'Pn': -71.34146478841117, 'Prob': 0.006182440735281105}
But I agree that the naming is confusing and the handling is not safe.
A suggestion is to rename p
to _p
(to indicate that it is an internal array) and power
to p
(P
should be associated with for period). I think this is more intuitive.
The code will be not fully backwards compatible.