Expectations in kernel weights
As I am playing with gwlearn and kernels over there, compared to mgwr, I noticed that kernels in libpysal behave unexpectedly (besides an actual bug in #789). What I would assume that we use the kernel function, say Gaussian, to assign weights from 1 at distance 0 progressively to 0. But that is not the case. The progression of a Gaussian kernel goes from 0.4 as shown in the fig below (bandwidth 100). The same happens in GeoDa.
However, if I use the kernel as defined in mgwr, it behaves as I expected.
def _gaussian_mgwr(distances, bandwidth):
zs = distances / bandwidth
return numpy.exp(-0.5 * ((zs)**2))
The reason I don't understand why we should not start at 1 is that if I get a kernel-based W, and want to assign self weight afterwards, I am pretty much without options of doing that correctly. The value for 0 is 0.3989422804014327. Though I would intuitively expect it to be 1.
Similar behaviour, albeit not that drastic, is in a few of the other kernels.
The formula for Gaussian we use is this:
def _gaussian(distances, bandwidth):
u = distances / bandwidth
return numpy.exp(-((u / 2) ** 2)) / (numpy.sqrt(2 * numpy.pi))
Why is it divided by (numpy.sqrt(2 * numpy.pi))? If we don't then it matches MGWR and also the Gaussian kernel definition in support vector machine and elsewhere.
Can someone shed some light on this? It follows the formula I've found in @sjsrey's Modern spatial econometrics in practice but have no clue where that comes from and why it looks like this.
Just making some notes here, primarily for myself to wrap my head around this.
Over in https://github.com/uscuni/gwlearn/issues/3#issuecomment-2920584059 @TaylorOshan mentions that we need to take a square root of weights before passing onto sklearn. Which applies to the formula in MGWR. If we do sqrt of that, we get exactly the same curve as we would get in libpysal if we excluded the division.
def _libpysal_no_division(distances, bandwidth):
u = distances / bandwidth
return numpy.exp(-((u / 2) ** 2))
def _mgwr(distances, bandwidth):
zs = distances / bandwidth
return numpy.exp(-0.5 * ((zs)**2))
x = numpy.linspace(10, 500)
plt.plot(x, _libpysal_no_division(x, 100), label="lib_no_division")
plt.plot(x, numpy.sqrt(_mgwr(x, 100)), label="sqrt of mgwr", linestyle="--")
plt.plot(x, _mgwr(x, 100), label="mgwr")
plt.legend()
The formula for Gaussian we use is this:
def _gaussian(distances, bandwidth): u = distances / bandwidth return numpy.exp(-((u / 2) ** 2)) / (numpy.sqrt(2 * numpy.pi)) Why is it divided by
(numpy.sqrt(2 * numpy.pi))? If we don't then it matches MGWR and also the Gaussian kernel definition in support vector machine and elsewhere.Can someone shed some light on this? It follows the formula I've found in @sjsrey's Modern spatial econometrics in practice but have no clue where that comes from and why it looks like this.
The value for 0 is 0.3989422804014327 because that is the value of the gaussian kernel at $z=0$: with $K(z) = \frac{1}{\sqrt{2\pi}} e^{\frac{-z^2}{2}}$. In other words $0.3989 \approx \frac{1}{\sqrt{2\pi}}$. This is a normalized kernel, whereas omitting the division by numpy.sqrt(2*numpy.pi) gives an unnormalized kernel.
I think I got that far in my understanding. What I don't get is why would I want the normalised option.
Shall we maybe add a keyword to build_kernel controlling this normalisation?
There are different ways of normalizing kernels. Some normalize the kernel values so that K(0)=1. Others normalize the kernel so that integral(K(X))=1 over the domain of the kernel. Sometimes, we call the latter "proper" kernels in Bayesian computation, since they're "proper" probability distributions. See, for instance, this list.. If we were opinionated, we'd standardize the kernels so that the integral is always 1, or that K(0)=1.
I should say—we're not opinionated, so we have kernels with varying K(0)=1 and integral(K(X)) = 1.
Adding a keyword arg would increase the clarity and allow for different use cases (i.e., kernels as proper pdfs, versus unnormalized kernels).
Agreed to schedule a sprint on this.
Relevant discussion in https://github.com/pysal/mgwr/issues/157