auto-sklearn
Failing configurations
EDIT: After some investigation, this appears to have less to do with the configurations and more to do with imputation. After trying to recreate the failures, it seems that X arrays that reach a threshold of NaNs cause the configurations to fail. These NaNs are added randomly, which explains why the failures are infrequent.
Edit2:
"fast_ica" with "fun": "exp" fails if the majority of the data is NaN.
Edit3:
"fast_ica" with "fast_ica:whiten": "False" fails with NaNs in the input.
Edit4:
"fast_ica" with "whiten": "False" fails even with no NaN values present.
Edit5:
"fast_ica" with the "iris" dataset works, even with a high occurrence of NaNs. The failure seems to depend more on the frequency of 0s in the dataset than on NaNs.
Edit6: Trying to force a certain "feature_preprocessor:__choice__" is currently not possible. Manually editing the Config is not straightforward and should be revisited once ConfigSpace.Configuration is updated to allow easier modification of a Config. See ConfigSpace issue #205 for why it is not straightforward to delete a key and add a new one.
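To make the NaN and zero observations in the edits above easier to check, here is a minimal standard-library sketch that injects NaNs at random and reports the fraction of NaN and zero cells. The helper names `inject_nans` and `cell_fractions` and the toy matrix are hypothetical and not part of auto-sklearn; the actual tests may add NaNs differently.

```python
import math
import random


def inject_nans(rows, fraction, seed=0):
    """Return a copy of `rows` where each cell is set to NaN with probability `fraction`."""
    rng = random.Random(seed)
    return [
        [math.nan if rng.random() < fraction else value for value in row]
        for row in rows
    ]


def cell_fractions(rows):
    """Return (nan_fraction, zero_fraction) computed over all cells."""
    cells = [value for row in rows for value in row]
    n_nan = sum(1 for value in cells if math.isnan(value))
    n_zero = sum(1 for value in cells if value == 0)  # NaN never compares equal to 0
    return n_nan / len(cells), n_zero / len(cells)


# Toy data (an assumption for illustration): a small matrix with many zeros.
X = [[0.0, 1.5, 0.0], [2.0, 0.0, 3.1], [0.0, 0.0, 4.2]]
X_nan = inject_nans(X, fraction=0.5, seed=1)
nan_fraction, zero_fraction = cell_fractions(X_nan)
print(f"NaN fraction: {nan_fraction:.2f}, zero fraction: {zero_fraction:.2f}")
```

Logging these two fractions next to each failing run might help separate the NaN-driven failures from the zero-driven ones.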
We leave some randomness in the configurations that get tested when testing different classifier and regressor components. The failing configurations are collected here:
- Python version 3.8
- Test
test/test_pipeline/test_classification.py::SimpleClassificationPipelineTest::test_configurations_sparse
Configuration:
balancing:strategy, Value: 'weighting'
classifier:__choice__, Value: 'sgd'
classifier:sgd:alpha, Value: 7.27693595714389e-05
classifier:sgd:average, Value: 'False'
classifier:sgd:eta0, Value: 0.013654826040547558
classifier:sgd:fit_intercept, Constant: 'True'
classifier:sgd:learning_rate, Value: 'invscaling'
classifier:sgd:loss, Value: 'log'
classifier:sgd:penalty, Value: 'l1'
classifier:sgd:power_t, Value: 0.5468767593727824
classifier:sgd:tol, Value: 8.162675288740052e-05
data_preprocessor:__choice__, Value: 'feature_type'
data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'no_coalescense'
data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'mean'
data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'quantile_transformer'
data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:n_quantiles, Value: 467
data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:output_distribution, Value: 'normal'
feature_preprocessor:__choice__, Value: 'kernel_pca'
feature_preprocessor:kernel_pca:gamma, Value: 6.985386846337043
feature_preprocessor:kernel_pca:kernel, Value: 'rbf'
feature_preprocessor:kernel_pca:n_components, Value: 10
- Python version 3.8
- Test
test/test_pipeline/test_classification.py::SimpleClassificationPipelineTest::test_configurations_signed_data
Configuration:
balancing:strategy, Value: 'weighting'
classifier:__choice__, Value: 'lda'
classifier:lda:shrinkage, Value: 'auto'
classifier:lda:tol, Value: 0.038890093430048595
data_preprocessor:__choice__, Value: 'feature_type'
data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'minority_coalescer'
data_preprocessor:feature_type:categorical_transformer:category_coalescence:minority_coalescer:minimum_fraction, Value: 0.001521146558163954
data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'mean'
data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'none'
feature_preprocessor:__choice__, Value: 'fast_ica'
feature_preprocessor:fast_ica:algorithm, Value: 'deflation'
feature_preprocessor:fast_ica:fun, Value: 'exp'
feature_preprocessor:fast_ica:whiten, Value: 'False'
- Python version 3.10
- Test
SimpleClassificationPipelineTest.test_configurations_sparse
Configuration:
balancing:strategy, Value: 'none'
classifier:__choice__, Value: 'qda'
classifier:qda:reg_param, Value: 0.7722372097734942
data_preprocessor:__choice__, Value: 'feature_type'
data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'no_coalescense'
data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'median'
data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'quantile_transformer'
data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:n_quantiles, Value: 1761
data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:output_distribution, Value: 'normal'
feature_preprocessor:__choice__, Value: 'kernel_pca'
feature_preprocessor:kernel_pca:gamma, Value: 2.351280410584469
feature_preprocessor:kernel_pca:kernel, Value: 'rbf'
feature_preprocessor:kernel_pca:n_components, Value: 10
- Python version 3.8
- Test
SimpleClassificationPipelineTest.test_configurations_signed_data
Configuration:
balancing:strategy, Value: 'none'
classifier:__choice__, Value: 'gaussian_nb'
data_preprocessor:__choice__, Value: 'feature_type'
data_preprocessor:feature_type:categorical_transformer:categorical_encoding:__choice__, Value: 'encoding'
data_preprocessor:feature_type:categorical_transformer:category_coalescence:__choice__, Value: 'no_coalescense'
data_preprocessor:feature_type:numerical_transformer:imputation:strategy, Value: 'most_frequent'
data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__, Value: 'quantile_transformer'
data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:n_quantiles, Value: 1004
data_preprocessor:feature_type:numerical_transformer:rescaling:quantile_transformer:output_distribution, Value: 'normal'
feature_preprocessor:__choice__, Value: 'fast_ica'
feature_preprocessor:fast_ica:algorithm, Value: 'deflation'
feature_preprocessor:fast_ica:fun, Value: 'exp'
feature_preprocessor:fast_ica:whiten, Value: 'False'
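
Since forcing a particular choice through the Config is not straightforward, it can help to at least turn the printed dumps above into plain dictionaries, for example to filter for the ones that select "fast_ica". The following is a minimal sketch that assumes the "key, Value: ..." / "key, Constant: ..." line format shown above; `parse_configuration` is a hypothetical helper, not part of auto-sklearn or ConfigSpace.

```python
import ast
import re

# Matches lines such as "classifier:sgd:alpha, Value: 7.27e-05"
# and "classifier:sgd:fit_intercept, Constant: 'True'".
LINE_RE = re.compile(r"^(?P<key>[^,\s]+), (?:Value|Constant): (?P<value>.+)$")


def parse_configuration(text):
    """Parse a printed configuration dump into a {hyperparameter: value} dict."""
    config = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank lines and headers such as "Configuration:"
        # Values are printed as Python literals (quoted strings, ints, floats).
        config[match["key"]] = ast.literal_eval(match["value"])
    return config


dump = """
Configuration:
balancing:strategy, Value: 'none'
classifier:__choice__, Value: 'gaussian_nb'
feature_preprocessor:__choice__, Value: 'fast_ica'
feature_preprocessor:fast_ica:algorithm, Value: 'deflation'
feature_preprocessor:fast_ica:fun, Value: 'exp'
feature_preprocessor:fast_ica:whiten, Value: 'False'
"""

config = parse_configuration(dump)
uses_fast_ica = config.get("feature_preprocessor:__choice__") == "fast_ica"
print(uses_fast_ica, config["feature_preprocessor:fast_ica:whiten"])
```

Note that quoted values such as 'False' stay strings, matching how they appear in the dumps.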