Using different feature set for each model

Open RaduStoicescu opened this issue 7 years ago • 2 comments

It is advised to use different feature sub-sets across the models for diversity.

Is this possible with heamy?

RaduStoicescu, Jun 10 '17 10:06

Yes, it's possible. You can implement this logic inside a custom model function, as in the example below, or just add a separate dataset for each feature subset.

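import numpy as np
import xgboost as xgb


def xgboost_model(X_train, y_train, X_test, y_test=None, random_state=9999):
    params = {
        'objective': 'reg:squarederror',  # 'reg:linear' is the deprecated name for this objective
        'learning_rate': 0.02,
        'max_depth': 20,
        'subsample': 0.8,
        'colsample_bytree': 0.8,
        'seed': random_state,
        'verbosity': 0,  # replaces the deprecated 'silent' flag
        'tree_method': 'exact',
    }
    # Number of boosting rounds is an argument of xgb.train, not a booster parameter
    num_boost_round = 100

    na_value = np.nan

    # Keep only this model's feature subset (assumes X_train and X_test are pandas DataFrames)
    subset_of_columns = ['a', 'b', 'c']
    X_train = X_train[subset_of_columns]
    X_test = X_test[subset_of_columns]

    dtrain = xgb.DMatrix(X_train, label=y_train, missing=na_value)
    # maximize=True dropped: it only matters with early stopping and is wrong for an error metric
    model = xgb.train(params, dtrain, num_boost_round)
    return model.predict(xgb.DMatrix(X_test, missing=na_value))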
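
If several models need different subsets, hard-coding the column list in each function gets repetitive. One possible way around that is a small wrapper that filters the columns and then delegates to an existing model function. This is only a sketch and not part of heamy: `with_feature_subset` is a hypothetical helper, and it assumes the wrapped function has the same signature as `xgboost_model` above and that the inputs support list-of-columns indexing the way pandas DataFrames do.

```python
def with_feature_subset(model_fn, columns, name=None):
    """Return a model function that trains and predicts on `columns` only.

    Hypothetical helper, not a heamy API. Assumes `model_fn` takes
    (X_train, y_train, X_test, y_test=None) and that X_train / X_test
    can be indexed with a list of column names.
    """
    columns = list(columns)  # copy, so later changes by the caller have no effect

    def wrapped(X_train, y_train, X_test, y_test=None, **params):
        # Filter train and test with the same column list, then delegate.
        return model_fn(X_train[columns], y_train, X_test[columns], y_test, **params)

    # Give each wrapper a distinct, readable name for logs and debugging.
    base = getattr(model_fn, "__name__", "model")
    wrapped.__name__ = name or "{}__{}".format(base, "_".join(columns))
    return wrapped
```

With this, something like `with_feature_subset(xgboost_model, ['a', 'b'])` and `with_feature_subset(xgboost_model, ['c', 'd'])` would give two model functions that differ only in the features they see, provided the hard-coded filter is removed from `xgboost_model` itself.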

rushter, Jun 10 '17 10:06

Thanks!

Adding new datasets is a painfully obvious solution.
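
Whichever route is taken, the subsets themselves have to be chosen somehow. One common option, not something suggested in this thread, is to draw a random subset of columns per model. The sketch below uses only the standard library. `sample_feature_subsets` and its `fraction` and `seed` parameters are hypothetical names, and it assumes the column names are unique.

```python
import random


def sample_feature_subsets(columns, n_models, fraction=0.8, seed=0):
    """Draw one random column subset per model.

    Hypothetical helper: each subset holds about `fraction` of the
    columns (at least one), sampled without replacement.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    columns = list(columns)
    if not columns:
        raise ValueError("columns must not be empty")

    size = max(1, round(fraction * len(columns)))
    rng = random.Random(seed)  # local RNG: reproducible, global state untouched

    subsets = []
    for _ in range(n_models):
        chosen = set(rng.sample(columns, size))
        # Preserve the original column order inside each subset.
        subsets.append([c for c in columns if c in chosen])
    return subsets
```

Each returned list can then be used as the `subset_of_columns` for one model, or to build one column-filtered dataset per model.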

RaduStoicescu, Jun 10 '17 17:06