ddf-pipeline
ddf-pipeline copied to clipboard
Determine NDir automatically
Working on a galactic field. One of the identified issue for me is that the noise is higher is the galactic fields, and there are "less" sources to calibrate against our 45 directions.
We did something similar to find the number of data point in a solve step here:
https://github.com/mhardcastle/ddf-pipeline/blob/3d2bda36aef9be51393a7c08e76047afbbc6db50/scripts/pipeline.py#L645
So if I have a good estimate of the variance V
in the visibilities, and a given dt
-dnu
of the solution interval, what's the maximum NDir
?
Perhaps a formulae for the noise in the Jones terms estimates given the aforementioned quantities would be enough
And assuming a signal S
uniformly distributed over the field...
I think that's the version for NDir=1
: V_g \proto Var_v / (n_bin_in_sol * S**2)
So would we have V_g \proto NDir * Var_v / (n_bin_in_sol * S**2)
?
Hi @cyriltasse Yes because if we assume the signal distribution is unifrom. Then conribution by each direction is S/NDir.
I think it should be NDir^2 actually. so V_g \proto (NDir ^2)* Var_v / (n_bin_in_sol * S^2)
V_g \proto (NDir ^2)* Var_v / (n_bin_in_sol * S^2)
and n_bin_in_sol
ain't squared?
No that is not squared. That does not change.
ok - thanks... I'll try to put that in the pipeline asap, and see how it improves the galactic fields... it should definitly...
Do you mean maximum or optimal number of directions? Maximum is easy. At least the maximum for a unique solution to exist. Optimal is a different story entirely... I don't see how that formula translates to the DD case so trivially. One of the assumptions to get to the DI formula is that the flux is concentrated at the center of the field. Might still provide a useable rule of thumb but I'm not sure
So what's the maximum then @landmanbester ? What's the formulae above for?
Well, without a regulariser, you need fewer parameters than data points so roughly
Ndir * (Nt/Delt) * (Nv/Delv) * Ncor * Na < Nt * Nv * Ncor * Na(Na-1)/2
or
Ndir / (Delt * Delv) < (Na - 1)/2
where Delt and Delv are the solution interval widths. I think that formula tells you how wide your solution interval has to be to get a certain variance on the gain solutions if S is the flux at the center of the field. I suspect you can probably come up with a similar rule of thumb for the DD case but I would have to think about it. Maybe @ulricharmel already has though and I'm just to thick to see it. @ulricharmel can you elaborate a bit on how you extended that to the DD case? Did you work it out from the Cramer-Rao bound?
This is true for a linear system, right?
This is how I came up with that approximation
Ah ok. This is a bit of a leap of faith but might still give you a useful rule of thumb. Note that the m_{dpqt} term is really the source coherency and is not equal to the amplitude. Looking at 20, which implies that the variance scales as Ndir^2, this will probably give you a conservative estimate of the maximum number of directions for a predetermined variance on the gain solutions. I suspect it's not quite that bad in practice.
As for what I said about the max number of directions above, for linear models with a Gaussian likelihood, for sure, the Hessian becomes singular and you need to use the pseudo-inverse to get a soln. For non-linear models, I think you can have cases where the Hessian becomes non-positive definite as well. At least in my experiments with StefCal (no Wirtinger) and using the full Hessian (as opposed to only the diagonal) I had to use the pseudo-inverse when the condition on the maximum number of directions was violated. This also gives gain solutions that did not match the input simulated gains at all (using only 2 correlations so no unitary ambiguity) which is probably due to overfitting. Note that this applies to Gaussian likelihoods (and so also NLLS) but not necessarily to distributions with heavier tails. Anyway, my 2c on the matter. @cyriltasse please let us know if 20 turns out to be useful
@landmanbester you are totally right m_{dpqt} is coherency and will only be equal to the Amplitude if the source is at the phase center.
Hi @cyriltasse I was looking at you function again. I just notice you are computing M as the mean of the data col. I thought you will want to do that with the mean of Model given that data has both noise and RFI. You get the model on this line but you don't use it https://github.com/mhardcastle/ddf-pipeline/blob/3d2bda36aef9be51393a7c08e76047afbbc6db50/scripts/pipeline.py#L651
Also, you defined nb=T**2/(M/S)**2, without including the factor for number of antennas (Na-3).