scikit-lego
[FEATURE] Add `LinearEmbedder` model to the mix.
In this live stream I seem to be able to show that KNN can perform on par with boosted models once you improve the representation of X. A quick trick to do this is to first fit a linear model and then use the learned coefficients to rescale X before indexing it.
This implementation shows the idea:
    from sklearn.base import BaseEstimator, RegressorMixin
    from sklearn.linear_model import Ridge
    from sklearn.neighbors import KNeighborsRegressor

    class RidgeKNNRegressor(BaseEstimator, RegressorMixin):
        def __init__(self, n_neighbors=5, coef_fit=False, weights="uniform"):
            self.n_neighbors = n_neighbors
            self.coef_fit = coef_fit
            self.weights = weights

        def fit(self, X, y):
            if self.coef_fit:
                # Fit a linear model and rescale X by its coefficients
                self.mod_ = Ridge(fit_intercept=False).fit(X, y)
                X = X * self.mod_.coef_
            self.knn_ = KNeighborsRegressor(
                n_neighbors=self.n_neighbors, weights=self.weights
            ).fit(X, y)
            return self

        def predict(self, X):
            if self.coef_fit:
                # Apply the same rescaling that was used at fit time
                X = X * self.mod_.coef_
            return self.knn_.predict(X)
Instead of doing both the embedding and the KNN in one go, though, I think it would be nicer to split this up and have a dedicated (meta?) estimator that can add this context to X.
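To make the proposal concrete, here is a minimal sketch of what such a standalone transformer could look like. The name `LinearEmbedder` comes from the issue title; the `estimator` parameter and the exact API are my assumptions, not a settled design:

    import numpy as np
    from sklearn.base import BaseEstimator, TransformerMixin, clone
    from sklearn.linear_model import Ridge
    from sklearn.utils.validation import check_is_fitted

    class LinearEmbedder(BaseEstimator, TransformerMixin):
        """Rescale X by the coefficients of a fitted linear model.

        Sketch only: the `estimator` parameter is a hypothetical design choice,
        defaulting to Ridge(fit_intercept=False) as in the snippet above.
        """

        def __init__(self, estimator=None):
            self.estimator = estimator

        def fit(self, X, y):
            est = self.estimator if self.estimator is not None else Ridge(fit_intercept=False)
            # Keep a fitted clone so the user-supplied estimator stays untouched
            self.estimator_ = clone(est).fit(X, y)
            return self

        def transform(self, X):
            check_is_fitted(self, "estimator_")
            # Elementwise rescaling: features with large coefficients
            # dominate the distance metric of a downstream KNN
            return np.asarray(X) * self.estimator_.coef_

This would compose naturally in a pipeline, e.g. `make_pipeline(LinearEmbedder(), KNeighborsRegressor())`, which recovers the behaviour of the `RidgeKNNRegressor` snippet above.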
@FBruzzesi I can pick this up, but feel free to tell me if you doubt this idea.