mgwr
Formula of get_AICc
In diagnostics.py, the get_AICc formula is
https://github.com/pysal/mgwr/blob/2a955355e31fe4a124b49ce4723335d308ee09d2/mgwr/diagnostics.py#L11-L30
However, as noted in its comment, it depends on the GLM setting. Its result therefore differs from the AICc definition described in Li et al. (2019).
(Even when I changed the sigma2_v1 parameter of gwr.GWR, the resulting AICc value did not change.)
I suspect that the following code is consistent with the definition above.
gwr.n * (np.log(np.sum(np.square(gwr.resid_response))) - np.log(gwr.n-gwr.ENP) + np.log(2*np.pi) + (gwr.n+gwr.ENP) / (gwr.n-2-gwr.ENP))
Is there any reason why the current implementation is employed?
- Li, Z., Fotheringham, A. S., Li, W., & Oshan, T. (2019). Fast Geographically Weighted Regression (FastGWR): a scalable algorithm to investigate spatial process heterogeneity in millions of observations. International Journal of Geographical Information Science, 33(1), 155–175. https://doi.org/10.1080/13658816.2018.1521523
Hi @hayato-n. The only difference is the denominator used when calculating the error variance: mgwr uses the MLE (RSS/n), whereas the Li et al. paper describes the unbiased estimator (RSS/(n-k)). I think the MLE version implemented here is more common, so for consistency the later update of fastgwr also uses the MLE (Link).
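To make the difference concrete, here is a minimal sketch that uses synthetic residuals rather than a fitted mgwr model. The helper name aicc_gaussian, the sample size, the residuals, and the value of k are all assumptions for illustration; k stands in for the effective number of parameters (gwr.ENP in the snippets in this thread).

```python
import numpy as np


def aicc_gaussian(resid, k, unbiased=False):
    """AICc for a Gaussian model, using the formula discussed in this thread.

    Hypothetical helper for illustration only; it is not part of mgwr.
    """
    n = resid.shape[0]
    rss = np.sum(np.square(resid))
    # The MLE divides RSS by n; the unbiased estimator divides by n - k.
    sigma2 = rss / (n - k) if unbiased else rss / n
    return n * (np.log(sigma2) + np.log(2 * np.pi) + (n + k) / (n - 2 - k))


rng = np.random.default_rng(0)
resid = rng.normal(size=200)  # synthetic residuals, not from a real model
k = 12.5                      # assumed effective number of parameters

aicc_mle = aicc_gaussian(resid, k)
aicc_unbiased = aicc_gaussian(resid, k, unbiased=True)

# The two versions differ only by the term n * log(n / (n - k)).
n = resid.shape[0]
gap = aicc_unbiased - aicc_mle
print(aicc_mle, aicc_unbiased, gap)
```

For a fixed n the gap depends only on k and grows with it, so the choice of estimator can matter when comparing models with different effective numbers of parameters.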
Hi @Ziqi-Li, thanks for your reply.
I confirmed that the following code is consistent with get_AICc's behaviour.
# ML
sigma2 = np.sum(np.square(gwr.resid_response)) / gwr.n
# unbiased
# sigma2 = np.sum(np.square(gwr.resid_response)) / (gwr.n - gwr.ENP)
# AICc
gwr.n * (np.log(sigma2) + np.log(2*np.pi) + (gwr.n+gwr.ENP) / (gwr.n-2-gwr.ENP))
I find it unintuitive that the sigma2_v1 parameter does not affect the AICc formula. It would be helpful if the reason were stated in a comment in get_AICc or in its documentation.
Thank you again for your helpful comment!
Hi @hayato-n, great that you find it consistent now. I think sigma2_v1 (which is calculated with the denominator n-k) is actually not used in the AIC formula, so modifying it doesn't affect the outcome.
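A short derivation may help show why the variance denominator drops out. It assumes, as the comment in get_AICc suggests, that the Gaussian AICc is built from the model's maximized log-likelihood plus a penalty of the form 2n(k+1)/(n-k-2); that penalty form and the symbols RSS, n, and k (effective number of parameters) are notation introduced here, not quoted from the source.

```latex
\begin{aligned}
\hat{\sigma}^2_{\mathrm{ML}} &= \frac{\mathrm{RSS}}{n}, \qquad
\ell = -\frac{n}{2}\left[\log\!\left(2\pi\hat{\sigma}^2_{\mathrm{ML}}\right) + 1\right] \\
\mathrm{AICc} &= -2\ell + \frac{2n(k+1)}{n-k-2} \\
 &= n\log\hat{\sigma}^2_{\mathrm{ML}} + n\log(2\pi) + n + \frac{2n(k+1)}{n-k-2} \\
 &= n\left[\log\hat{\sigma}^2_{\mathrm{ML}} + \log(2\pi) + \frac{n+k}{n-k-2}\right]
\end{aligned}
```

The maximized Gaussian log-likelihood depends on the residuals only through RSS/n, so under this assumption an alternative variance estimate with a different denominator never enters the expression. The last line matches the MLE version of the snippet posted earlier in this thread.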
Yes, you are right: sigma2_v1 does not affect the result. I think your comments here are informative, so I will send a small pull request to clarify the AICc formula. Please review it and merge it if you like it.