mgwr
index error in gwr prediction
Vinayaraj Poliyapam writes:
Thanks a lot for the pysal mgwr work!
I was using GWmodel in R earlier, and I have now tried your module in Python. I can fit the model, but I run into a problem when predicting with more samples than I used for training.
I get the following error.
IndexError: index 3699 is out of bounds for axis 0 with size 3699
Any comments on this would be greatly appreciated.
Hi Vinayaraj,
I'll need a minimal working example before I can help you.
Please send me the data you're using & the code you're running.
In my case, the number of training records (X_train) must exceed the number of test records (X_test), with a split of at least 51:49 for X_train:X_test. The problem is that when we apply this algorithm to raster data, the number of pixels to predict will exceed the number of training records held by the model in all cases.
This is the default as you know: X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.75, test_size=0.25)
No problem here: X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.51, test_size=0.49)
But the problem will appear with: X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.49, test_size=0.51)
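The three splits above differ only in whether the prediction set ends up larger than the training set. A minimal sketch of that check follows; the helper name `prediction_would_overflow` and the dataset size of 1000 are hypothetical, chosen only for illustration, and the condition `n_pred > n_train` is inferred from the errors reported in this thread rather than taken from the mgwr source.

```python
def prediction_would_overflow(n_train, n_pred):
    """Return True when there are more prediction points than training
    records, the situation reported in this thread as raising IndexError."""
    return n_pred > n_train


n_samples = 1000  # hypothetical dataset size, not from the thread

# Map each training percentage to whether predicting on the remaining
# records would exceed the number of training records.
overflow_by_split = {}
for train_pct in (75, 51, 49):
    n_train = n_samples * train_pct // 100
    n_pred = n_samples - n_train
    overflow_by_split[train_pct] = prediction_would_overflow(n_train, n_pred)

print(overflow_by_split)
```

Under these assumptions only the 49:51 split is flagged, which matches the behaviour described above.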
I'm having the same issue. I can reproduce it with this code:
import numpy as np
from mgwr.gwr import GWR
cal_coords = np.random.randn(10,2)
cal_y = np.random.randn(10,1)
cal_X = np.random.randn(10,2)
pred_coords = np.random.randn(20,2)
pred_y = np.random.randn(20,1)
pred_X = np.random.randn(20,2)
model = GWR(cal_coords, cal_y, cal_X, 7)
gwr_results = model.fit()
pred_results = model.predict(pred_coords, pred_X)
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-1-1d157a7d4a7a> in <module>
14 gwr_results = model.fit()
15
---> 16 pred_results = model.predict(pred_coords, pred_X)
~/github/pysal/mgwr/mgwr/gwr.py in predict(self, points, P, exog_scale, exog_resid, fit_params)
411 self.bw,
412 points)
--> 413 gwr = self.fit(**fit_params)
414
415 return gwr
~/github/pysal/mgwr/mgwr/gwr.py in fit(self, ini_params, tol, max_iter, solve, searching)
353 max_iter, wi=wi)
354 params[i, :] = rslt[0].T
--> 355 predy[i] = rslt[1][i]
356 w[i] = rslt[3][i]
357 S[i] = np.dot(self.X[i], rslt[5])
IndexError: index 10 is out of bounds for axis 0 with size 10
I tested this on the latest version from github as of 2019-02-08
commit 3bdfdf275716aefef4561decee6ec078da4259d4
Merge: f77e334 5a49150
Author: Wei Kang <[email protected]>
Date: Fri Jan 4 21:03:43 2019 -0800
Merge pull request #48 from pysal/version-bump
update version in __init__.py
@ljwolf I have exactly the same problem. It happens when the test data (used for prediction) is larger than the training data (used for fitting). I hope the developers can solve the problem soon.
I noticed that the variable 'self.P' isn't used in 'self.predict()', so I rewrote the predict() function. It runs, but I don't know whether the results are correct. The attached zip file contains the changed code: gwr.zip
I'm receiving the same index error when my training data is smaller than my test/prediction data. I can work around this by subsetting my test/prediction data and iterating through the subsets, but I'm unsure whether this is the best way to do it.
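The subset-and-iterate workaround described above could look roughly like the sketch below. The helper names `chunk_slices` and `predict_in_chunks` are hypothetical and not part of mgwr; `predict_fn` stands for any callable taking a coordinates chunk and a predictor chunk, such as `model.predict` from the reproduction earlier in this thread. Whether repeated `predict` calls on the same model object give correct results depends on the mgwr version and has not been verified here.

```python
def chunk_slices(n_items, chunk_size):
    """Yield slices covering range(n_items) in pieces of at most chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, n_items, chunk_size):
        yield slice(start, min(start + chunk_size, n_items))


def predict_in_chunks(predict_fn, pred_coords, pred_X, chunk_size):
    """Call predict_fn on successive chunks of the prediction data and
    return the per-chunk results in order."""
    results = []
    for sl in chunk_slices(len(pred_coords), chunk_size):
        results.append(predict_fn(pred_coords[sl], pred_X[sl]))
    return results


# Assumed usage with the objects from the reproduction (not run here):
# chunk_results = predict_in_chunks(
#     model.predict, pred_coords, pred_X, chunk_size=len(cal_coords))
```

Setting `chunk_size` to the number of training records keeps every call at or below the size at which the error was reported.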
I have to resample my training data to make the two sets consistent in size. What problems does this cause for the results?
Is there an effective solution?
The problem arises because prediction fully reuses the fitting function used for training, which unfortunately triggers the part that iterates over all training data (using the index of the test data ...) to calculate the fitting performance. The solution is similar to what @WilliamZcy proposed, but there is no need to add an additional "self.P", as it is already correctly accessed in "predictions()" through the line "P = self.model.P". Below is my quick remedy for this issue. I have tested that the results are consistent with the original version.
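The remedy code itself is not reproduced in this thread. As a separate toy illustration of the mechanism described above (plain Python, not mgwr's code), the first function below mimics the loop shape seen in the traceback, where a per-training-record result is indexed with the prediction-point index. The second shows a prediction-only pattern in which each point uses its own coefficients and its own predictor row. Both function names and all data are hypothetical.

```python
def fit_like_loop(n_train, n_points):
    """Loop once per prediction point, but index a result that has one
    entry per training record, as in `predy[i] = rslt[1][i]`."""
    fitted_train = [0.0] * n_train  # stands in for rslt[1]
    predy = [None] * n_points
    for i in range(n_points):
        predy[i] = fitted_train[i]  # IndexError once i >= n_train
    return predy


def predict_like_loop(local_params, P):
    """Prediction-only pattern: the value at point i is the dot product of
    that point's local coefficients with its own predictor row."""
    return [
        sum(b * x for b, x in zip(beta_i, p_i))
        for beta_i, p_i in zip(local_params, P)
    ]


try:
    fit_like_loop(n_train=10, n_points=20)
except IndexError as exc:
    print("reproduced:", exc)
```

With 10 training records and 20 prediction points the first function fails at index 10, the same position as in the traceback above. The second never indexes a training-sized array, so it is independent of the training set size.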