Stacking Ensemble Machine Learning With Python

Open khuyentran1401 opened this issue 4 years ago • 0 comments

TL;DR

Stacking, or Stacked Generalization, is an ensemble machine learning algorithm that uses a meta-learning algorithm to learn how best to combine the predictions from two or more base machine learning algorithms.
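To make the definition concrete, here is a minimal hand-rolled sketch of the idea, not the article's code. The choice of base models, the 70/30 split, the 5-fold out-of-fold predictions, and the logistic regression meta-model are all assumptions picked for illustration.

```python
# Minimal sketch of stacking: a meta-model learns from base-model predictions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# synthetic binary classification data (illustrative assumption)
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# level 0: the base models whose predictions will be combined
base_models = [KNeighborsClassifier(), DecisionTreeClassifier(random_state=1)]

# out-of-fold class-1 probabilities, so the meta-model never sees
# predictions made on rows a base model was trained on
train_meta = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method='predict_proba')[:, 1]
    for m in base_models
])

# level 1: the meta-model learns how to weight the base predictions
meta_model = LogisticRegression()
meta_model.fit(train_meta, y_train)

# at prediction time, refit each base model on all training rows
# and pass its test-set probabilities to the meta-model
test_meta = np.column_stack([
    m.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    for m in base_models
])
stacked_accuracy = meta_model.score(test_meta, y_test)
print('stacked accuracy: %.3f' % stacked_accuracy)
```

Training the meta-model on out-of-fold predictions is the important design choice here: fitting it on in-sample predictions would reward base models that simply memorize the training data.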

Article Link

https://machinelearningmastery.com/stacking-ensemble-machine-learning-with-python/

Author

Jason Brownlee

Key Takeaways

Stacking combines well-performing models on a classification or regression task and can make predictions that perform better than any single model in the ensemble.
Compare the candidate machine learning models against each other and choose the best one.

Useful Code Snippets

# compare standalone models for binary classification
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from matplotlib import pyplot
 
# get the dataset
def get_dataset():
	X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=1)
	return X, y
 
# get a list of models to evaluate
def get_models():
	models = dict()
	models['lr'] = LogisticRegression()
	models['knn'] = KNeighborsClassifier()
	models['cart'] = DecisionTreeClassifier()
	models['svm'] = SVC()
	models['bayes'] = GaussianNB()
	return models
 
# evaluate a given model using repeated stratified k-fold cross-validation
def evaluate_model(model, X, y):
	cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
	# error_score='raise' surfaces fitting errors instead of recording NaN scores
	scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1, error_score='raise')
	return scores

# define dataset
X, y = get_dataset()
# get the models to evaluate
models = get_models()
# evaluate the models and store results
results, names = list(), list()
for name, model in models.items():
	scores = evaluate_model(model, X, y)
	results.append(scores)
	names.append(name)
	# report mean accuracy and its standard deviation across all folds
	print('>%s %.3f (%.3f)' % (name, mean(scores), std(scores)))
# plot model performance for comparison
pyplot.boxplot(results, showmeans=True)
# label the boxes via xticks, which works whether boxplot expects labels or tick_labels
pyplot.xticks(range(1, len(names) + 1), names)
pyplot.show()
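The snippet above only scores the standalone models. As a hedged follow-up, the same base models can be combined with scikit-learn's `StackingClassifier` (available since scikit-learn 0.22). In this sketch the logistic regression meta-model, the inner `cv=5`, and the helper name `get_stacking` are assumptions, and a single 10-fold pass is used to keep the run short; set `n_repeats=3` to match the evaluation above.

```python
# Sketch: stack the five standalone models and score the ensemble.
from numpy import mean, std
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# hypothetical helper: build the stacked ensemble
def get_stacking():
    # level 0: the same base models compared above
    level0 = [
        ('lr', LogisticRegression()),
        ('knn', KNeighborsClassifier()),
        ('cart', DecisionTreeClassifier()),
        ('svm', SVC()),
        ('bayes', GaussianNB()),
    ]
    # level 1: meta-model that combines the base predictions (assumed choice)
    level1 = LogisticRegression()
    # cv=5 controls the internal folds used to build the meta-model's training data
    return StackingClassifier(estimators=level0, final_estimator=level1, cv=5)

# same synthetic dataset as the standalone comparison
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=1)

# one repeat of stratified 10-fold cross-validation (shortened for speed)
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=1, random_state=1)
scores = cross_val_score(get_stacking(), X, y, scoring='accuracy', cv=cv,
                         error_score='raise')
print('>stacking %.3f (%.3f)' % (mean(scores), std(scores)))
```

Comparing this mean accuracy with the standalone results shows whether stacking is worth its extra training cost on a given dataset; it is not guaranteed to win.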

Useful Tools

Comments/ Questions

Apr 12 '20 01:04 khuyentran1401
