Getting a different result when training a model again with the best selected hyperparameters
Hi,
I am using hyperopt.fmin to find the best hyperparameters. I've used PCA for feature selection and SVM for classification, and combined them in a pipeline:
from sklearn import pipeline, preprocessing, decomposition
from sklearn.svm import SVC

def getModelInstance(self):
    # Scale features, reduce dimensionality with PCA, then classify with a linear SVM
    model = pipeline.Pipeline([
        ('scaler', preprocessing.StandardScaler()),
        ('reducer', decomposition.PCA(random_state=1)),
        ('estimator', SVC(kernel='linear', class_weight='balanced', probability=True))
    ])
    return model
Then I did 3-fold cross-validation:
from sklearn import model_selection

train_loss = model_selection.cross_val_score(model, x_train, y_train,
                                             cv=3, scoring='neg_log_loss')
Then I apply hyperopt.fmin to find the best hyperparameters (n_components for PCA and C for SVM) that give the minimum log loss.
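For reference, the search step roughly looks like the sketch below. This is only a minimal sketch under assumptions: the search-space bounds, max_evals, and the way getModelInstance is reached here are illustrative, not my original code.

from hyperopt import fmin, tpe, hp, STATUS_OK
import numpy as np
from sklearn import model_selection

# Hypothetical search space; the actual bounds are not shown above
space = {
    'n_components': hp.quniform('n_components', 2, 60, 1),
    'C': hp.loguniform('C', np.log(1e-3), np.log(1e3)),
}

def objective(params):
    model = getModelInstance()  # assuming the pipeline factory shown above is callable here
    model.set_params(reducer__n_components=int(params['n_components']),
                     estimator__C=params['C'])
    scores = model_selection.cross_val_score(model, x_train, y_train,
                                             cv=3, scoring='neg_log_loss')
    # hyperopt minimizes its objective, so return the positive log loss
    return {'loss': -scores.mean(), 'status': STATUS_OK}

best = fmin(objective, space, algo=tpe.suggest, max_evals=100)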
But I have a question: when I take the best C and n_components, set them in the model, and train it again, the neg_log_loss is not the same as what I got before (in the hyperopt.fmin result). For example:
I got my best hyperparameters:
n_components = 41, C = 0.485879250931, log_loss = 0.581865148897
I set the parameters here:
model = pipeline.Pipeline([
    ('scaler', preprocessing.StandardScaler()),
    ('reducer', decomposition.PCA(n_components=41, random_state=1)),
    ('estimator', SVC(C=0.485879250931, kernel='linear', class_weight='balanced', probability=True))
])
return model
and do the 3-fold cross-validation, but I get a different neg_log_loss and don't know why.
Looks like the difference could come from the data shuffling inside the SVC. Set the random_state parameter of the SVC to 1 as well in both models; I think you will get the same neg_log_loss then.
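Something like the sketch below (a minimal sketch of the suggested change; with probability=True the SVC runs an internal cross-validation for Platt scaling, and random_state presumably controls that shuffling):

from sklearn import pipeline, preprocessing, decomposition
from sklearn.svm import SVC

model = pipeline.Pipeline([
    ('scaler', preprocessing.StandardScaler()),
    ('reducer', decomposition.PCA(n_components=41, random_state=1)),
    ('estimator', SVC(C=0.485879250931, kernel='linear', class_weight='balanced',
                      probability=True, random_state=1))  # seed the SVC in both models
])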