smt
KPLS benchmark on Griewank function
Hello,
I am currently working on KPLS techniques as part of my thesis, and I am trying to reproduce the results established in the following article: https://hal.archives-ouvertes.fr/hal-01232938/document . The article focuses in part on applying the KPLS model to the Griewank function while varying the input range, the number of inputs, and the number of learning points.
I wrote a code to test KPLS under the same conditions as those defined in the article. Here is my code:
import time
import numpy as np
from smt.sampling_methods import LHS, Random
from smt.surrogate_models import KRG, KPLS

def griewank_function(x):
    """Griewank's function: multimodal, symmetric, inseparable."""
    y = np.zeros(x.shape[0])
    for j in range(x.shape[0]):
        x_j = x[j, :]
        partA = 0
        partB = 1
        for i in range(x.shape[1]):
            partA += x_j[i] ** 2
            partB *= np.cos(x_j[i] / np.sqrt(i + 1))
        y[j] = 1 + (partA / 4000.0) - partB
    return y

def calculate_error(sm, X_test, Y_test):
    """Relative L2 error (in %) of the surrogate over the test set."""
    Y_predicted = sm.predict_values(X_test)
    err_rel = 100 * np.sqrt(np.sum(np.square(Y_predicted[:, 0] - Y_test)) / np.sum(np.square(Y_test)))
    return err_rel
# Parameters
X_inf = -5.0
X_sup = 5.0
dim = 20
num = 300
num_test = 5000
cri = 'ese'  # 'ese' or 'c' (center)
X_lim = np.tile(np.array([X_inf, X_sup]), (dim, 1))

# Calculating test points. To be commented out after the first iteration
sampling = Random(xlimits=X_lim)
Xt = sampling(num_test)
Yt = griewank_function(Xt)

# Initializing errors
err_krg = np.array([])
err_kpls1 = np.array([])
err_kpls2 = np.array([])
err_kpls3 = np.array([])
# Loop for computing mean and sigma
for j in range(10):
    sampling = LHS(xlimits=X_lim, criterion=cri)
    X = sampling(num)
    Y = griewank_function(X)
    print(Y.shape)

    # Initializing surrogate models
    sm_krg = KRG(print_prediction=False, corr='squar_exp', n_start=10, theta_bounds=[1e-06, 20.0], theta0=[1e-2])
    sm_kpls1 = KPLS(n_comp=1, print_prediction=False, corr='squar_exp', n_start=10, theta_bounds=[1e-06, 20.0], theta0=[1e-2])
    sm_kpls2 = KPLS(n_comp=2, print_prediction=False, corr='squar_exp', n_start=10, theta_bounds=[1e-06, 20.0], theta0=[1e-2])
    sm_kpls3 = KPLS(n_comp=3, print_prediction=False, corr='squar_exp', n_start=10, theta_bounds=[1e-06, 20.0], theta0=[1e-2])

    # Training surrogate models (timing each one)
    sm_krg.set_training_values(X, Y)
    start = time.time()
    sm_krg.train()
    time_krg = time.time() - start

    sm_kpls1.set_training_values(X, Y)
    start = time.time()
    sm_kpls1.train()
    time_kpls1 = time.time() - start

    sm_kpls2.set_training_values(X, Y)
    start = time.time()
    sm_kpls2.train()
    time_kpls2 = time.time() - start

    sm_kpls3.set_training_values(X, Y)
    start = time.time()
    sm_kpls3.train()
    time_kpls3 = time.time() - start

    # Calculating errors
    err_rel_krg = calculate_error(sm_krg, Xt, Yt)
    err_rel_kpls1 = calculate_error(sm_kpls1, Xt, Yt)
    err_rel_kpls2 = calculate_error(sm_kpls2, Xt, Yt)
    err_rel_kpls3 = calculate_error(sm_kpls3, Xt, Yt)
    err_krg = np.append(err_krg, err_rel_krg)
    err_kpls1 = np.append(err_kpls1, err_rel_kpls1)
    err_kpls2 = np.append(err_kpls2, err_rel_kpls2)
    err_kpls3 = np.append(err_kpls3, err_rel_kpls3)

print("error krg", np.mean(err_krg), "; sigma", np.sqrt(np.var(err_krg)), "; time krg", time_krg)
print("error kpls1", np.mean(err_kpls1), "; sigma", np.sqrt(np.var(err_kpls1)), "; time kpls1", time_kpls1)
print("error kpls2", np.mean(err_kpls2), "; sigma", np.sqrt(np.var(err_kpls2)), "; time kpls2", time_kpls2)
print("error kpls3", np.mean(err_kpls3), "; sigma", np.sqrt(np.var(err_kpls3)), "; time kpls3", time_kpls3)
I am using the exact same error definition as the one defined in the article: the error is computed over 5000 random test points, and the correlation function is Gaussian (squar_exp).
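To be explicit, the error above is the relative L2 norm in percent, 100 * ||y_pred - y_true|| / ||y_true||, evaluated on the test set. A standalone sanity check of that metric (relative_error_pct is just an illustrative name for this post):

```python
import numpy as np

def relative_error_pct(y_pred, y_true):
    """100 * ||y_pred - y_true||_2 / ||y_true||_2."""
    y_pred = np.ravel(y_pred)
    y_true = np.ravel(y_true)
    return 100.0 * np.sqrt(np.sum((y_pred - y_true) ** 2) / np.sum(y_true ** 2))

print(relative_error_pct([1.0, 2.0], [1.0, 2.0]))  # 0.0 for a perfect prediction
print(relative_error_pct([2.0, 0.0], [1.0, 0.0]))  # 100.0: error as large as the signal
```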
Here are the results I get (the array on the left contains my results and the array on the right the results from the article):
My first observation is that the error depends a lot on the DOE used for learning; that is to say, if I generate a new DOE for the same case, I do not get the same results.
In case 3, you can see that I used 2 different samples, one with ESE optimization and one with the center criterion, which leads to very different results. Furthermore, using KPLS with 2 or 3 components is supposed to lead to very small errors (according to the array on the right).
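One way I found to make runs comparable while debugging is to fix the sampler's seed; if I am not mistaken, smt's LHS accepts a random_state argument for this (worth checking against your smt version). The principle, sketched here with plain numpy in place of the LHS sampler:

```python
import numpy as np

dim, num = 20, 300
# two generators seeded identically produce the same DOE,
# so surrogate errors become comparable from run to run
doe_a = np.random.default_rng(42).uniform(-5.0, 5.0, size=(num, dim))
doe_b = np.random.default_rng(42).uniform(-5.0, 5.0, size=(num, dim))
print(np.allclose(doe_a, doe_b))  # True: identical designs
```

This does not explain the gap with the paper, but it separates DOE-to-DOE variance from genuine model differences.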
What could be the cause of such a difference?
I also did some tests with a [-5, 5] input range. Do not hesitate to ask me for more details. Thank you in advance,
Alexandre
Hi, sorry for the late answer. I can't reproduce the results of the paper with the latest version either. As you may have guessed, it is not a high priority, but it should definitely be investigated by pulling older versions to pin down the change responsible for this.
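For what it's worth, that kind of regression hunt can be automated with git bisect, assuming a script repro.py (hypothetical name) that exits non-zero whenever the KPLS error is far from the paper's values:

```shell
git clone https://github.com/SMTorg/smt.git && cd smt
git bisect start
git bisect bad HEAD              # latest version: does not match the paper
git bisect good <old-tag>        # an older tag believed to match the paper
git bisect run python repro.py   # git bisects automatically on the exit code
```

The `<old-tag>` placeholder would have to be found by trying a few released versions first.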
I have encountered the same problem. Have you solved it?
Hi, no I didn't go any further.
Thanks. I changed the sampling method, which also leads to great changes in precision. Could you post the article again? The link won't open.