[ENH] make fit_predict_default configurable
Describe the feature or idea you want to propose
Currently fit_predict makes estimates on the train data by default through cross-validation. It hard codes the number of folds to 10, or to the minimum number of cases in any one class if that is smaller. I would like to be able to set this to something other than 10; I'm not immediately sure of the best way of configuring this.
It also always fits the whole model. I'd like to be able to turn that off.
The context is using fit_predict to score channels for channel selection. I would like it to be fast, so I want 3-fold CV and not to build the whole model.
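For concreteness, a minimal sketch of the current behaviour from the user side (RocketClassifier and load_basic_motions are used purely for illustration):

from aeon.classification.convolution_based import RocketClassifier
from aeon.datasets import load_basic_motions

X, y = load_basic_motions(split="train")
clf = RocketClassifier()
# always runs 10-fold (or min-class-count) cross-validation internally
# and then fits the full model; neither is currently configurable
train_preds = clf.fit_predict(X, y)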
Describe your proposed solution
A mocked-up fit for the channel selector:
n_channels = X.shape[1]
scores = np.zeros(n_channels)
# Evaluate each channel with the classifier
for i in range(n_channels):
    preds = self.classifier.fit_predict(X[:, i, :], y)
    scores[i] = accuracy_score(y, preds)
# Select the top n_keep channels
sorted_indices = np.argsort(-scores)
n_keep = math.ceil(n_channels * self.proportion)
self.channels_selected_ = sorted_indices[:n_keep]
Currently this builds 11 models per channel (the 10 cross-validation fits plus the final fit of the whole model), assuming each class has at least 10 cases.
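For example, on a 6-channel dataset (the channel count is just illustrative) that is 6 × 11 = 66 classifier fits to score the channels; with 3-fold CV and no final model fit it would drop to 6 × 3 = 18. For reference, the current _fit_predict_default is: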
def _fit_predict_default(self, X, y, method):
    # fit the classifier
    self._fit(X, y)
    # predict using cross-validation
    cv_size = 10
    _, counts = np.unique(y, return_counts=True)
    min_class = np.min(counts)
    if min_class < cv_size:
        cv_size = min_class
    if cv_size < 2:
        raise ValueError(
            f"All classes must have at least 2 values to run the "
            f"_fit_{method} cross-validation."
        )
    random_state = getattr(self, "random_state", None)
    estimator = _clone_estimator(self, random_state)
    return cross_val_predict(
        estimator,
        X=X,
        y=y,
        cv=cv_size,
        method=method,
        n_jobs=self._n_jobs,
    )
It could maybe be done with kwargs for fit_predict:
for i in range(n_channels):
    preds = self.classifier.fit_predict(X[:, i, :], y, **{"cv_size": 3, "full_model": False})
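A minimal sketch of how _fit_predict_default could consume those options (the cv_size and full_model names are just the ones from the mock-up above, not a settled API):

def _fit_predict_default(self, X, y, method, cv_size=10, full_model=True):
    # only fit the final model on all of the data if requested
    if full_model:
        self._fit(X, y)
    # cap the number of folds at the minimum class count, as now
    _, counts = np.unique(y, return_counts=True)
    min_class = np.min(counts)
    if min_class < cv_size:
        cv_size = min_class
    if cv_size < 2:
        raise ValueError(
            f"All classes must have at least 2 values to run the "
            f"_fit_{method} cross-validation."
        )
    random_state = getattr(self, "random_state", None)
    estimator = _clone_estimator(self, random_state)
    return cross_val_predict(
        estimator,
        X=X,
        y=y,
        cv=cv_size,
        method=method,
        n_jobs=self._n_jobs,
    )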
Describe alternatives you've considered, if relevant
I could set it in the constructor, or pass it as an explicit parameter with a default of 10.
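As a rough sketch of those alternatives (parameter names are hypothetical), the call sites would look like:

# option 1: hypothetical constructor parameters, defaulting to cv_size=10, full_model=True
clf = RocketClassifier(cv_size=3, full_model=False)
preds = clf.fit_predict(X, y)

# option 2: hypothetical explicit parameters on fit_predict itself
clf = RocketClassifier()
preds = clf.fit_predict(X, y, cv_size=3, full_model=False)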