auto-sklearn
AutoMLRegressor does not support task binary
I can't fit a model with AutoSklearnRegressor:
from autosklearn.regression import AutoSklearnRegressor
reg = AutoSklearnRegressor(time_left_for_this_task=5*60, per_run_time_limit=30, n_jobs=8)
reg.fit(X=X_train, y=y_train)
This is my log:
ValueError Traceback (most recent call last)
Input In [25], in <cell line: 2>()
1 reg = AutoSklearnRegressor(time_left_for_this_task=5*60, per_run_time_limit=30, n_jobs=8)
----> 2 reg.fit(X=X_train, y=y_train)
File ~/miniforge3/lib/python3.10/site-packages/autosklearn/estimators.py:1587, in AutoSklearnRegressor.fit(self, X, y, X_test, y_test, feat_type, dataset_name)
1576 raise ValueError(
1577 "Regression with data of type {} is "
1578 "not supported. Supported types are {}. "
(...)
1582 "".format(target_type, supported_types)
1583 )
1585 # Fit is supposed to be idempotent!
1586 # But not if we use share_mode.
-> 1587 super().fit(
1588 X=X,
1589 y=y,
1590 X_test=X_test,
1591 y_test=y_test,
1592 feat_type=feat_type,
1593 dataset_name=dataset_name,
1594 )
1596 return self
File ~/miniforge3/lib/python3.10/site-packages/autosklearn/estimators.py:540, in AutoSklearnEstimator.fit(self, **kwargs)
538 if self.automl_ is None:
539 self.automl_ = self.build_automl()
--> 540 self.automl_.fit(load_models=self.load_models, **kwargs)
542 return self
File ~/miniforge3/lib/python3.10/site-packages/autosklearn/automl.py:2394, in AutoMLRegressor.fit(self, X, y, X_test, y_test, feat_type, dataset_name, only_return_configuration_space, load_models)
2383 def fit(
2384 self,
2385 X: SUPPORTED_FEAT_TYPES,
(...)
2392 load_models: bool = True,
2393 ) -> AutoMLRegressor:
-> 2394 return super().fit(
2395 X,
2396 y,
2397 X_test=X_test,
2398 y_test=y_test,
2399 feat_type=feat_type,
2400 dataset_name=dataset_name,
2401 only_return_configuration_space=only_return_configuration_space,
2402 load_models=load_models,
2403 is_classification=False,
2404 )
File ~/miniforge3/lib/python3.10/site-packages/autosklearn/automl.py:611, in AutoML.fit(self, X, y, task, X_test, y_test, feat_type, dataset_name, only_return_configuration_space, load_models, is_classification)
609 y_task = type_of_target(y)
610 if not self._supports_task_type(y_task):
--> 611 raise ValueError(
612 f"{self.__class__.__name__} does not support" f" task {y_task}"
613 )
614 self._task = self._task_type_id(y_task)
615 else:
ValueError: AutoMLRegressor does not support task binary
System Details (if relevant)
- 0.15.0
- Macbook Air M1
Hi @dadangsetio, we use sklearn.utils.multiclass.type_of_target to identify the task type based on the y you pass in. My guess is that it looks something like [0, 1, 0, 1, 1, ...], which gets identified as a binary classification problem. Is this your intended behavior? If so, then I'm not sure we have any way to override this behaviour, but I can look into it if it is.
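To illustrate the check described above, here is a small sketch of how type_of_target labels a few targets. The example arrays are made up for illustration; only the type_of_target call itself comes from scikit-learn.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Illustrative targets (not taken from the reporter's data).
y_int_binary = np.array([0, 1, 0, 1, 1])
y_float_binary = np.array([0.0, 1.0, 0.0, 1.0])  # floats, but only two integral values
y_continuous = np.array([0.3, 1.7, 2.5, 0.9])

kind_int = type_of_target(y_int_binary)
kind_float = type_of_target(y_float_binary)
kind_cont = type_of_target(y_continuous)

print(kind_int)    # binary
print(kind_float)  # binary
print(kind_cont)   # continuous
```

Note that casting a 0/1 target to float does not change the result, so a target like this is still reported as binary.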
Thank you for the response, @eddiebergman. You are right that the content of y is binary, so how can I solve this?
You may prefer to use probability scores from predict_proba and use a Classifier instead of a Regressor.
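As a rough sketch of that idea, the snippet below gets a continuous score per sample from a classifier's predict_proba. It uses scikit-learn's LogisticRegression purely as a fast stand-in (an assumption for illustration); AutoSklearnClassifier exposes the same fit / predict_proba methods, so it could be swapped in.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary data, only for illustration.
X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Stand-in classifier; replace with AutoSklearnClassifier(...) if desired.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# predict_proba returns one column per class; column 1 is P(y == 1).
proba = clf.predict_proba(X_test)
scores = proba[:, 1]
print(scores[:5])
```

The scores lie in [0, 1] and can be used wherever a regression-style output on the binary target was wanted.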
If you really need to skip the type_of_target check, then you'll need to use the AutoML class instead of the AutoSklearnRegressor, which is just a fancy wrapper that makes some things simpler. Depending on your use case, this should be okay.
Here's a sample snippet:
from sklearn.datasets import make_classification
from autosklearn.automl import AutoML
from autosklearn.constants import REGRESSION
X, y = make_classification()
print(y) # [0, 0, 1, ...]
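automl = AutoML(
    time_left_for_this_task=30,
    per_run_time_limit=5,
    # ... remaining constructor arguments elided
)
# Passing task=REGRESSION explicitly skips the type_of_target check.
automl.fit(X, y, task=REGRESSION)  # ... other fit arguments elided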
Here's the __init__(...) and the fit(...) calls from AutoML for you.
Best, Eddie
I am using the sample AutoML snippet, but I'm getting an error like this:
[ERROR] [2022-11-07 19:18:21,120:Client-AutoML(1):441115fc-5e96-11ed-acf3-363077345c9d] (' Dummy prediction failed with run state StatusType.CRASHED and additional output: {\'error\': \'Result queue is empty\', \'exit_status\': "<class \'pynisher.limit_function_call.AnythingException\'>", \'subprocess_stdout\': \'\', \'subprocess_stderr\': \'Process pynisher function call:\\nTraceback (most recent call last):\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap\\n self.run()\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 108, in run\\n self._target(*self._args, **self._kwargs)\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/pynisher/limit_function_call.py", line 108, in subprocess_func\\n resource.setrlimit(resource.RLIMIT_AS, (mem_in_b, mem_in_b))\\nValueError: current limit exceeds maximum limit\\n\', \'exitcode\': 1, \'configuration_origin\': \'DUMMY\'}.',)
[ERROR] [2022-11-07 19:18:21,120:Client-AutoML(1):441115fc-5e96-11ed-acf3-363077345c9d] (' Dummy prediction failed with run state StatusType.CRASHED and additional output: {\'error\': \'Result queue is empty\', \'exit_status\': "<class \'pynisher.limit_function_call.AnythingException\'>", \'subprocess_stdout\': \'\', \'subprocess_stderr\': \'Process pynisher function call:\\nTraceback (most recent call last):\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap\\n self.run()\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 108, in run\\n self._target(*self._args, **self._kwargs)\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/pynisher/limit_function_call.py", line 108, in subprocess_func\\n resource.setrlimit(resource.RLIMIT_AS, (mem_in_b, mem_in_b))\\nValueError: current limit exceeds maximum limit\\n\', \'exitcode\': 1, \'configuration_origin\': \'DUMMY\'}.',)
Traceback (most recent call last):
File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/autosklearn/automl.py", line 765, in fit
self._do_dummy_prediction()
File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/autosklearn/automl.py", line 489, in _do_dummy_prediction
raise ValueError(msg)
ValueError: (' Dummy prediction failed with run state StatusType.CRASHED and additional output: {\'error\': \'Result queue is empty\', \'exit_status\': "<class \'pynisher.limit_function_call.AnythingException\'>", \'subprocess_stdout\': \'\', \'subprocess_stderr\': \'Process pynisher function call:\\nTraceback (most recent call last):\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap\\n self.run()\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 108, in run\\n self._target(*self._args, **self._kwargs)\\n File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/pynisher/limit_function_call.py", line 108, in subprocess_func\\n resource.setrlimit(resource.RLIMIT_AS, (mem_in_b, mem_in_b))\\nValueError: current limit exceeds maximum limit\\n\', \'exitcode\': 1, \'configuration_origin\': \'DUMMY\'}.',)
You should construct AutoML with the same parameters you used for the estimator in your original code. My guess is that you had set memory_limit=None.
The issue is that, as far as I know, there is no way to limit the memory of processes on Mac. See https://github.com/automl/pynisher#features
The version of pynisher linked above is actually newer than the one we use, and we need to update to it.
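For anyone who wants to check whether their platform accepts an address-space limit at all, here is a minimal sketch using only the standard library. The helper name try_limit_address_space is hypothetical, and unlike the pynisher call in the traceback (which sets both soft and hard limits to the same value) this version only changes the soft limit, so it can be undone.

```python
import resource


def try_limit_address_space(limit_mb: int) -> bool:
    """Try to cap this process's address space; return True if the OS accepted it."""
    limit_bytes = limit_mb * 1024 * 1024
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)

    # A soft limit above a finite hard limit is always rejected.
    if hard != resource.RLIM_INFINITY and limit_bytes > hard:
        return False

    try:
        # Only the soft limit is changed; the hard limit is left as it was.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))
    except (ValueError, OSError):
        # This is the branch hit in the traceback above
        # ("current limit exceeds maximum limit").
        return False
    return True


# Use a very large limit (1 TiB) so the check itself does not starve the process.
supported = try_limit_address_space(1024 * 1024)
print("RLIMIT_AS can be set:", supported)
```

If this prints False, a memory limit based on RLIMIT_AS will not work on that machine. The resource module is Unix-only, so this sketch does not apply to Windows.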
classifier = AutoSklearn2Classifier(
    time_left_for_this_task=15 * 60,
    per_run_time_limit=30,
    memory_limit=None,
    n_jobs=1,
    max_models_on_disc=10,
    ensemble_size=10
).fit(preprocessor.transform(train_x), train_y, preprocessor.transform(valid_x), valid_y)
There is an internal check that prohibits running without a memory limit:
[ERROR] [2024-07-18 15:19:23,002:Client-AutoML(1):5923f702-4508-11ef-82ea-42442fa1d044] '>' not supported between instances of 'NoneType' and 'int'
Traceback (most recent call last):
File "/Users/Viktor/PycharmProjects/laion-copyright/.venv39/lib/python3.9/site-packages/autosklearn/automl.py", line 680, in fit
X, y = reduce_dataset_size_if_too_large(
File "/Users/Viktor/PycharmProjects/laion-copyright/.venv39/lib/python3.9/site-packages/autosklearn/util/data.py", line 430, in reduce_dataset_size_if_too_large
assert memory_limit > 0
TypeError: '>' not supported between instances of 'NoneType' and 'int'
It's such a shame we cannot use auto-sklearn on Apple Silicon. Hopefully one day you'll find a workaround!
Yes, it's true, I used to feel like that @ViktorooReps