pyKriging

It starts to predict the same value as I increase sample number to predict

Open matteoottaviani opened this issue 6 years ago • 3 comments

Hi, sorry for bothering you. I have been dealing with the following problem for 3 months, hence I decided to try to share it.

I am training kriging and xgboost models on training sets of increasing size (say, from 100 to 1000 samples), where each sample has 15 inputs and 1 output. I then use the trained models to predict a test set that is always the same.

While I have never had any problem with the xgboost predictions, kriging behaves differently. Up to around 400 training samples I have no problems. As I increase the training set beyond roughly 400 samples, more and more of the last predicted values for the test set come out equal to the same value. Do you have any idea why? Thanks! Matteo

matteoottaviani avatar Mar 25 '19 18:03 matteoottaviani

It's really tough to say based on this information. Can you share the data?

capaulson avatar Apr 02 '19 05:04 capaulson

Thank you very much for your answer. Yes, I can share, of course.

    import pickle

    samplesize = 1000
    testsize = 100
    mc = 0
    X, OB = pickle.load(open('BH_DATA/sample'+str(samplesize)+'OB_'+str(mc)+'.pkl', 'rb'))
    Xt, OBt = pickle.load(open('BH_DATA/testsample'+str(testsize)+'OB_'+str(mc)+'.pkl', 'rb'))

I train on X (1000 input combinations) and three different model outputs OB[:,s], for s=[0,1,2], and I test on Xt with the three model outputs OBt[:,s]

Up to X=X[:400], OB=OB[:400], all predicted values differ from one another and it seems to work quite well. From X=X[:500], OB=OB[:500] onward, it predicts an increasing number of identical values at the bottom of the prediction lists for the 100 test samples.

thank you. mat

smple+test set.zip

matteoottaviani avatar Apr 03 '19 14:04 matteoottaviani
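
To put numbers on the symptom described above, here is a minimal sketch that counts how many distinct values a prediction list contains and how long the run of identical values at its end is, for each training size. It is not part of pyKriging. The helper names and the `fit_and_predict` callback are hypothetical: the callback is assumed to train on the first `n` samples and return the predictions for the fixed test set.

```python
def count_distinct(predictions, decimals=10):
    """Number of distinct predicted values after rounding to `decimals` places."""
    return len({round(float(v), decimals) for v in predictions})


def trailing_constant_run(predictions, tol=1e-12):
    """Length of the run of (near-)identical values at the end of the list."""
    values = [float(v) for v in predictions]
    if not values:
        return 0
    last = values[-1]
    run = 0
    # Walk backwards until a value differs from the final one by more than tol.
    for v in reversed(values):
        if abs(v - last) > tol:
            break
        run += 1
    return run


def sweep(train_sizes, fit_and_predict):
    """For each training size, record (size, distinct values, trailing run).

    `fit_and_predict(n)` is a hypothetical user-supplied callable that trains
    on the first n samples and returns the predictions for the test set.
    """
    report = []
    for n in train_sizes:
        preds = fit_and_predict(n)
        report.append((n, count_distinct(preds), trailing_constant_run(preds)))
    return report
```

Running `sweep([100, 200, ..., 1000], fit_and_predict)` for each of the three outputs would show at which training size the trailing run starts to grow, which may make the report easier to reproduce.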

Hi Matteo, Capaulson

I am trying to use a dataset with ~7000 samples, each having 8 Xs and 1 Y. I can train on the dataset, but I am not sure how to save the model. I went through the scripts but don't see any option to do that. If I use regression kriging, I don't see a plot function or a save-figure function either. So how do I actually plot the actual vs. predicted values of Y, or the error between them?

I think I will need to do the plotting outside the standard PyKriging library by extracting the predicted Y if using regression kriging.

What I am essentially trying to understand is:

  1. Is there a way to save the model?
  2. Which value in the code gives Y-predicted?

Appreciate your insights.

Best Regards, Manish

mjoshii avatar Oct 28 '20 11:10 mjoshii
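
For the two questions above, a minimal sketch under stated assumptions rather than a confirmed pyKriging feature: it assumes the trained model object can be serialized with Python's standard `pickle` module, and that it exposes a `predict` method taking a single input point, as the pyKriging `kriging` class appears to. Whether the regression kriging class behaves the same way is not confirmed in this thread. The helper names below are hypothetical.

```python
import pickle


def save_model(model, path):
    """Serialize a trained model to disk (assumes the object is picklable)."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    """Load a model previously written by save_model."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def predict_all(model, X):
    """Collect Y-predicted by calling model.predict on one row of X at a time.

    Assumes model.predict(row) returns a scalar or a size-1 array.
    """
    return [float(model.predict(row)) for row in X]


def residuals(y_true, y_pred):
    """Prediction error (predicted minus actual) for each sample."""
    return [p - t for t, p in zip(y_true, y_pred)]
```

The lists returned by `predict_all` and `residuals` can then be passed to any plotting library outside pyKriging, for example as a scatter plot of actual vs. predicted Y.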