heamy
Using different feature set for each model
It is often advised to use different feature subsets across the models to increase diversity.
Is this possible with heamy?
Yes, it's possible. You can implement this logic inside your custom model, as in the example below, or just create a separate dataset for each feature subset.
import numpy as np
import xgboost as xgb


def xgboost_model(X_train, y_train, X_test, y_test=None, random_state=9999):
    # Note: newer XGBoost releases use 'reg:squarederror' instead of
    # 'reg:linear' and 'verbosity' instead of 'silent'.
    params = {
        'objective': 'reg:linear',
        'learning_rate': 0.02,
        'max_depth': 20,
        'subsample': 0.8,
        'colsample_bytree': 0.8,
        'seed': random_state,
        'silent': 1,
        'tree_method': 'exact',
    }
    # Number of boosting rounds; passed to xgb.train() directly, not via params
    num_boost_round = 100
    na_value = np.nan

    # Filter columns: this model only sees its own feature subset
    subset_of_columns = ['a', 'b', 'c']
    X_train = X_train[subset_of_columns]
    X_test = X_test[subset_of_columns]

    dtrain = xgb.DMatrix(X_train, label=y_train, missing=na_value)
    model = xgb.train(params, dtrain, num_boost_round)
    return model.predict(xgb.DMatrix(X_test, missing=na_value))
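The second option, a separate dataset per feature subset, could look roughly like the sketch below. It is only an illustration under a few assumptions: `X_train` and `X_test` are pandas DataFrames, the subset names and the `missing_columns` / `build_pipeline` helpers are hypothetical, and the `Dataset`, `Regressor` and `ModelsPipeline` calls follow the usage shown in the heamy README, so check them against the version you have installed.

```python
# Hypothetical feature subsets, one per first-level model (names are illustrative).
FEATURE_SUBSETS = {
    "lr_abc": ["a", "b", "c"],
    "lr_cd": ["c", "d"],
}


def missing_columns(available, subsets):
    """Map each model name to the subset columns that are absent from the data."""
    available = set(available)
    missing = {}
    for name, columns in subsets.items():
        absent = [c for c in columns if c not in available]
        if absent:
            missing[name] = absent
    return missing


def build_pipeline(X_train, y_train, X_test, subsets=FEATURE_SUBSETS):
    """Build one heamy Dataset and Regressor per feature subset.

    Assumes pandas DataFrames as input and that heamy and scikit-learn are
    installed; the imports are local so missing_columns() works without them.
    """
    from heamy.dataset import Dataset
    from heamy.estimator import Regressor
    from heamy.pipeline import ModelsPipeline
    from sklearn.linear_model import LinearRegression

    problems = missing_columns(X_train.columns, subsets)
    if problems:
        raise KeyError(f"columns missing from X_train: {problems}")

    models = []
    for name, columns in subsets.items():
        # Each model gets its own Dataset restricted to its columns.
        dataset = Dataset(X_train[columns], y_train, X_test[columns])
        models.append(Regressor(dataset=dataset, estimator=LinearRegression, name=name))
    return ModelsPipeline(*models)
```

With this in place, something like `build_pipeline(X_train, y_train, X_test).stack(k=5, seed=111)` should give the stacked dataset for a second-level model, again assuming the README-style API.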
Thanks!
Adding new datasets is a painfully obvious solution.