autofeat
Reproducibility issue
Hello,
I noticed that results are not reproducible when using the library, i.e. the sklearn drop-in replacement classes produce slightly different results on each run.
For example, when using:
from autofeat import AutoFeatClassifier
features_engineer = AutoFeatClassifier()
features_engineer.fit_transform(data_train.data, data_train.target.value)
it calculates (or selects) different features each time.
I temporarily worked around this by seeding both global random number generators before fitting:
import random
import numpy as np
random.seed(seed)
np.random.seed(seed)
so that the outputs produced by AutoFeatClassifier stay constant across runs.
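To make the workaround concrete, here is a minimal sketch of what the seeding does. It does not call autofeat at all: `set_global_seeds` and `draw_sample` are hypothetical names, and `draw_sample` is only a stand-in for any code that draws from the global `random` and `np.random` state. Note that the seeds have to be reset before every run, not just once per session.

```python
import random

import numpy as np


def set_global_seeds(seed):
    """Seed both global RNGs (hypothetical helper, not part of autofeat)."""
    random.seed(seed)
    np.random.seed(seed)


def draw_sample():
    """Stand-in for a library call that relies on global RNG state."""
    return random.random(), np.random.permutation(10).tolist()


# Re-seeding before each run yields identical draws.
set_global_seeds(42)
first = draw_sample()
set_global_seeds(42)
second = draw_sample()
print(first == second)
```

This only covers code that reads the global generators in the current process; anything that uses its own generator or runs elsewhere would not be affected by these two calls.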
However, when I tried using the following:
selector = FeatureSelector(verbose=self.verbose, problem_type="classification", featsel_runs=5)
selector.fit_transform(df_indices, target)
the seed-setting trick above did not have the desired effect: the selected features still change between runs.
Is there an easy fix for this? Randomness must be introduced somewhere in the source that these global seeds do not control.