Formula of get_AICc

Open hayato-n opened this issue 2 years ago • 4 comments

In diagnostics.py, get_AICc is defined as follows: https://github.com/pysal/mgwr/blob/2a955355e31fe4a124b49ce4723335d308ee09d2/mgwr/diagnostics.py#L11-L30

However, as its comment notes, the result depends on the GLM settings. As a consequence, it differs from the AICc definition described in Li et al. (2019). (Even when I changed the sigma2_v1 parameter of gwr.GWR, the resulting AICc value did not change.)

I suspect that the following code is consistent with the definition above.

gwr.n * (np.log(np.sum(np.square(gwr.resid_response))) - np.log(gwr.n-gwr.ENP) + np.log(2*np.pi) + (gwr.n+gwr.ENP) / (gwr.n-2-gwr.ENP))

Is there any reason why the current implementation is employed?

  1. Li, Z., Fotheringham, A. S., Li, W., & Oshan, T. (2019). Fast Geographically Weighted Regression (FastGWR): a scalable algorithm to investigate spatial process heterogeneity in millions of observations. International Journal of Geographical Information Science, 33(1), 155–175. https://doi.org/10.1080/13658816.2018.1521523

hayato-n avatar May 30 '22 08:05 hayato-n

Hi @hayato-n. The only difference is the denominator used when calculating the error variance: mgwr uses the MLE (RSS/n), whereas the Li et al. paper describes the unbiased estimator (RSS/(n-k)). I think the MLE implemented here is the more common choice, so for consistency the later update of fastgwr also uses the MLE (Link).
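
For reference, the two variants can be written side by side. This is only a sketch: the notation is assumed here rather than taken from the mgwr source or the paper, with n the number of observations, k the effective number of parameters, RSS the residual sum of squares, and the AICc written in the same form as the one-line expression earlier in this thread.

```latex
\begin{align}
\hat{\sigma}^2_{\mathrm{ML}} &= \frac{\mathrm{RSS}}{n},
\qquad
\hat{\sigma}^2_{\mathrm{unb}} = \frac{\mathrm{RSS}}{n-k}, \\
\mathrm{AICc}(\hat{\sigma}^2) &= n\log\hat{\sigma}^2 + n\log(2\pi) + n\,\frac{n+k}{n-2-k}, \\
\mathrm{AICc}(\hat{\sigma}^2_{\mathrm{unb}}) - \mathrm{AICc}(\hat{\sigma}^2_{\mathrm{ML}})
  &= n\log\frac{n}{n-k}.
\end{align}
```

Because the gap depends only on n and k, the two variants can rank models differently when the models being compared have different effective numbers of parameters.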

Ziqi-Li avatar May 30 '22 18:05 Ziqi-Li

Hi @Ziqi-Li, thanks for your reply. I confirmed that the following code reproduces the value returned by get_AICc.

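import numpy as np

# gwr is assumed to be a fitted GWR results object exposing resid_response, n, and ENP

# ML estimate of the error variance: RSS / n
sigma2 = np.sum(np.square(gwr.resid_response)) / gwr.n

# Unbiased estimate: RSS / (n - ENP)
# sigma2 = np.sum(np.square(gwr.resid_response)) / (gwr.n - gwr.ENP)

# AICc
gwr.n * (np.log(sigma2) + np.log(2*np.pi) + (gwr.n+gwr.ENP) / (gwr.n-2-gwr.ENP))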
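
The snippet above needs a fitted model, so here is a minimal self-contained sketch of the same comparison on synthetic residuals. The helper name aicc_from_rss, the sample size, and the ENP value are all made up for illustration and are not part of mgwr; the sketch only shows how far apart the two variance estimators push the AICc.

```python
import numpy as np


def aicc_from_rss(rss, n, enp, unbiased=False):
    """AICc in the form used in this thread (hypothetical helper, not mgwr API).

    rss      : residual sum of squares
    n        : number of observations
    enp      : effective number of parameters
    unbiased : if True use RSS / (n - enp), otherwise the ML estimate RSS / n
    """
    denom = (n - enp) if unbiased else n
    sigma2 = rss / denom
    return n * (np.log(sigma2) + np.log(2 * np.pi) + (n + enp) / (n - 2 - enp))


# Synthetic residuals standing in for gwr.resid_response (assumed values).
rng = np.random.default_rng(0)
resid = rng.normal(size=200)
n = resid.size
enp = 12.5  # an arbitrary, non-integer effective number of parameters

rss = float(np.sum(np.square(resid)))
aicc_ml = aicc_from_rss(rss, n, enp)
aicc_unbiased = aicc_from_rss(rss, n, enp, unbiased=True)

# The two variants differ by n * log(n / (n - enp)), independent of the residuals.
gap = aicc_unbiased - aicc_ml
print(aicc_ml, aicc_unbiased, gap)
```

With the unbiased estimator the variance is larger, so the AICc is always higher than the ML-based value for the same fit.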

I find it unintuitive that the sigma2_v1 parameter does not affect the AICc value. It would be helpful if the reason were explained in a comment in get_AICc or in its documentation.

Thank you again for your helpful comment!

hayato-n avatar May 31 '22 03:05 hayato-n

Hi @hayato-n, great that you find it consistent now. I think sigma2_v1 (which is calculated with the denominator n-k) is not actually used in the AIC formula, so modifying it doesn't affect the outcome.

Ziqi-Li avatar May 31 '22 17:05 Ziqi-Li

Yes, you are right, sigma2_v1 does not affect the result. I think your comments here are informative, so I will send a small pull request to clarify the AICc formula. Please review it and accept it if you like it.

hayato-n avatar Jun 01 '22 02:06 hayato-n