Failures on several new datasets
Hi all,
we ran Auto-sklearn on all new datasets from studies 271 and 269 and realized that there are some failures due to datasets containing strings and to unknown categories at test time:
Dataset 359948
[ERROR] [amlb.benchmark:22:48:31.978] could not convert string to float: './SAT09/CRAFTED/rbsat/crafted/forced/rbsat-v2640c305320gyes10.cnf-1-MPhaseSAT_2011.02.15'
Traceback (most recent call last):
File "/bench/amlb/benchmark.py", line 414, in run
meta_result = framework.run(self._dataset, task_config)
File "/bench/frameworks/autosklearn_1032/init.py", line 16, in run
X_enc=dataset.train.X_enc,
File "/bench/amlb/utils/cache.py", line 73, in decorator
return cache(self, prop_name, prop_fn)
File "/bench/amlb/utils/cache.py", line 30, in cache
value = fn(self)
File "/bench/amlb/utils/process.py", line 520, in profiler
return fn(*args, **kwargs)
File "/bench/amlb/data.py", line 143, in X_enc
return self.data_enc[:, predictors_ind]
File "/bench/amlb/utils/cache.py", line 73, in decorator
return cache(self, prop_name, prop_fn)
File "/bench/amlb/utils/cache.py", line 30, in cache
value = fn(self)
File "/bench/amlb/utils/process.py", line 520, in profiler
return fn(*args, **kwargs)
File "/bench/amlb/data.py", line 130, in data_enc
encoded_cols = [f.label_encoder.transform(self.data[:, f.index]) for f in self.dataset.features]
File "/bench/amlb/data.py", line 130, in
Similar issues in datasets 359947, 359945 and 359942.
Dataset 359991
[ERROR] [amlb.benchmark:20:26:56.796] Found unknown categories ['S'] in column 0 during transform
Traceback (most recent call last):
File "/bench/amlb/benchmark.py", line 414, in run
meta_result = framework.run(self._dataset, task_config)
File "/bench/frameworks/autosklearn_1032/init.py", line 16, in run
X_enc=dataset.train.X_enc,
File "/bench/amlb/utils/cache.py", line 73, in decorator
return cache(self, prop_name, prop_fn)
File "/bench/amlb/utils/cache.py", line 30, in cache
value = fn(self)
File "/bench/amlb/utils/process.py", line 520, in profiler
return fn(*args, **kwargs)
File "/bench/amlb/data.py", line 143, in X_enc
return self.data_enc[:, predictors_ind]
File "/bench/amlb/utils/cache.py", line 73, in decorator
return cache(self, prop_name, prop_fn)
File "/bench/amlb/utils/cache.py", line 30, in cache
value = fn(self)
File "/bench/amlb/utils/process.py", line 520, in profiler
return fn(*args, **kwargs)
File "/bench/amlb/data.py", line 130, in data_enc
encoded_cols = [f.label_encoder.transform(self.data[:, f.index]) for f in self.dataset.features]
File "/bench/amlb/data.py", line 130, in
A similar issue occurs for dataset 211986.
Hi @mfeurer, may I ask how you ran autosklearn against those datasets? Did you use the latest code on master?
2e41ca1 (HEAD -> master, origin/master, origin/HEAD) increase default activity timeout to allow installation of large libraries when building docker image (#222)
I just tried with a docker image I built recently for autosklearn and it seems to work:
id task framework constraint fold result metric mode version params app_version utc duration training_duration predict_duration models_count seed acc auc balacc logloss
0 openml.org/t/359991 kick autosklearn test 0 0.780521 auc docker 0.12.0 dev [NA, NA, NA] 2020-12-14T14:49:28 696.5 613.3 2.4 5 2144378946 0.902726 0.780521 0.623346 0.371129
1 openml.org/t/359991 kick autosklearn test 1 0.784133 auc docker 0.12.0 dev [NA, NA, NA] 2020-12-14T14:59:57 629.2 606.9 2.5 4 2144378947 0.902726 0.784133 0.625739 0.355173
> ValueError: Found unknown categories ['S'] in column 0 during transform

This dataset has a categorical column represented as:
@attribute 'Trim' {'1','150','2','250','3','3 R','Adv','Bas','C','Car','CE','Cin','Cla','Cus','CX','CXL','CXS','DE','Den','DS','Dur','DX','eC','Edd','Edg','eL','Ent','ES','EX','EX-','Exe','FX4','GL','GLE','GLS','GS','GT','GTC','GTP','GTS','GX','GXE','GXP','Har','Her','Hig','Hyb','i','JLS','JLX','Kin','L','L 3','L10','L20','L30','Lar','LE','Lim','LL','LS','LT','LTZ','Lux','LW2','LW3','LX','LXi','Max','Maz','Nor','Out','Ove','OZ','Plu','Pre','Pro','R/T','Ral','Ren','RS','RT','s','S','SC1','SC2','SE','SE-','SEL','SES','Si','Sig','SL','SL1','SL2','SLE','SLT','Spe','Spo','Spy','SR5','SS','ST','Sta','STX','SV6','SVT','SX','SXT','T5','Tou','Ult','Val','VP','W/T','X','XE','XL','XLS','XLT','XR','XRS','XS','Xsp','Z24','Z71','ZR2','ZTS','ZTW','ZX2','ZX3','ZX4','ZX5','ZXW'}
meaning that here we have both 's' and 'S' as categorical values. The ARFF file makes the distinction, but the openml metadata doesn't, leading to your error. This has been fixed with https://github.com/openml/automlbenchmark/pull/208. Of course, the assumption here is that categorical values should be case-insensitive; I hope we don't have any examples where that isn't the case.
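For illustration, here is a minimal reproduction sketch using scikit-learn's `OrdinalEncoder` (not the actual amlb encoder): if the metadata-derived category list keeps only one of the two casings, the other casing becomes "unknown" at transform time.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical category list derived from metadata that lost the 's'/'S' distinction.
enc = OrdinalEncoder(categories=[["l", "s", "x"]])
enc.fit(np.array([["s"], ["l"], ["x"]]))

# The ARFF data still contains the upper-case variant:
enc.transform(np.array([["S"]]))
# ValueError: Found unknown categories ['S'] in column 0 during transform
```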
Another issue that was occurring with https://www.openml.org/t/211986 is fixed by https://github.com/openml/automlbenchmark/pull/209.
All those fixes are now in master.
The data associated with openml/t/359948 has been deactivated (https://www.openml.org/d/23701): I can't run it anymore. @PGijsbers, is this on purpose?
Also, the data associated with openml/t/359942 has a string column: the app doesn't currently support those columns (they won't be supported before https://github.com/openml/automlbenchmark/issues/116), so those datasets should be removed.
What currently happens with string columns is that the ARFF file and/or openml don't provide the list of categories (by definition...), so the app fails when trying to label-encode them.
Suggestions (sorted according to my preference):
- remove those datasets.
- translate the entire column to `nan`s for frameworks requiring numericals (see the sketch below).
- extract unique values from the data (can be huge) to allow encoding.
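A rough sketch of the second option, with hypothetical names (this is not the amlb API): string-typed columns are wiped to NaN so that frameworks expecting numeric input simply see them as fully missing.

```python
import numpy as np
import pandas as pd

def blank_string_columns(df, string_columns):
    """Return a copy of df with the given string-typed columns replaced by NaN."""
    out = df.copy()
    out[string_columns] = np.nan  # frameworks then treat these columns as fully missing
    return out

# Toy example:
train = pd.DataFrame({"Var1": [1.0, 2.0], "Var14741": ["crvh", None]})
print(blank_string_columns(train, ["Var14741"]))
```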
I am not sure why dataset 23701 is deactivated, I'll ask Joaquin.
The college dataset only has string values on attributes that should be ignored; they are marked as such on openml. All other attributes are either nominal or numeric.
Our framework should ignore features labeled as "ignore" under OpenMLDataset.ignore_attribute:
>>> import openml
>>> d = openml.datasets.get_dataset(42727)
>>> d.ignore_attribute
['school_name', 'school_webpage']
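For illustration, a hedged sketch (not the amlb loader) of dropping those columns with openml-python; note that `ignore_attribute` is `None` when a dataset has nothing to ignore, which the loader also has to handle.

```python
import openml

# Sketch only: fetch the data including the ignored columns, then drop whatever
# ignore_attribute lists (the attribute is None when the dataset ignores nothing).
d = openml.datasets.get_dataset(42727)
X, y, _, _ = d.get_data(
    target=d.default_target_attribute,
    include_ignore_attribute=True,
    dataset_format="dataframe",
)
ignored = d.ignore_attribute or []
X = X.drop(columns=[c for c in ignored if c in X.columns])
```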
> Our framework should ignore features labeled as "ignore"

@PGijsbers, agreed, and the fix was surprisingly non-trivial: https://github.com/openml/automlbenchmark/pull/224
> Hi @mfeurer, may I ask how you ran autosklearn against those datasets? Did you use the latest code on master?

No, but I'm doing that now.
- `python runbenchmark.py autosklearn openml/t/359948 -m local -f 0` gives the first error message, and the 3 other tasks I mentioned fail as well. But it appears you have now found this issue as well and are about to fix it.
- Issues with datasets 359991 and 211986 are now fixed by using the master branch.
@sebhrusen The deactivated dataset (https://www.openml.org/d/23701) is not associated with openml/t/359948. According to Joaquin that dataset has been deactivated for years.
@PGijsbers typo on my side... I was running openml/t/259948!
To which benchmark should that task belong? I don't see it in any of /s/218, /s/269, /s/270 and /s/271.
> To which benchmark should that task belong? I don't see it in any of /s/218, /s/269, /s/270 and /s/271.

None, it was a typo!
Based on the failing tasks described in this ticket I ran openml/t/259948 instead of openml/t/359948, and the fact that it was failing too probably convinced me that it was the reason.
I just tried the new tasks and it turns out that task 360115 has several string features which cannot be handled by the benchmark code itself:
[ERROR] [amlb.benchmark:13:35:02.848] could not convert string to float: 'crvh'
Traceback (most recent call last):
File "/bench/amlb/benchmark.py", line 444, in run
meta_result = self.benchmark.framework_module.run(self._dataset, task_config)
File "/bench/frameworks/autosklearn/__init__.py", line 16, in run
X_enc=dataset.train.X_enc,
File "/bench/amlb/utils/cache.py", line 73, in decorator
return cache(self, prop_name, prop_fn)
File "/bench/amlb/utils/cache.py", line 30, in cache
value = fn(self)
File "/bench/amlb/utils/process.py", line 521, in profiler
return fn(*args, **kwargs)
File "/bench/amlb/data.py", line 149, in X_enc
return self.data_enc[:, predictors_ind]
File "/bench/amlb/utils/cache.py", line 73, in decorator
return cache(self, prop_name, prop_fn)
File "/bench/amlb/utils/cache.py", line 30, in cache
value = fn(self)
File "/bench/amlb/utils/process.py", line 521, in profiler
return fn(*args, **kwargs)
File "/bench/amlb/data.py", line 136, in data_enc
encoded_cols = [f.label_encoder.transform(self.data[:, f.index]) for f in self.dataset.features]
File "/bench/amlb/data.py", line 136, in <listcomp>
encoded_cols = [f.label_encoder.transform(self.data[:, f.index]) for f in self.dataset.features]
File "/bench/amlb/datautils.py", line 247, in transform
return return_value(vec.astype(self.encoded_type, copy=False))
ValueError: could not convert string to float: 'crvh'
Excerpt from the features.xml:
<oml:feature>
<oml:index>14740</oml:index>
<oml:name>Var14741</oml:name>
<oml:data_type>string</oml:data_type>
<oml:is_target>false</oml:is_target>
<oml:is_ignore>false</oml:is_ignore>
<oml:is_row_identifier>false</oml:is_row_identifier>
<oml:number_of_missing_values>36726</oml:number_of_missing_values>
</oml:feature>
<oml:feature>
<oml:index>14741</oml:index>
<oml:name>Var14742</oml:name>
<oml:data_type>string</oml:data_type>
<oml:is_target>false</oml:is_target>
<oml:is_ignore>false</oml:is_ignore>
<oml:is_row_identifier>false</oml:is_row_identifier>
<oml:number_of_missing_values>49141</oml:number_of_missing_values>
</oml:feature>
<oml:feature>
<oml:index>14742</oml:index>
<oml:name>Var14743</oml:name>
<oml:data_type>string</oml:data_type>
<oml:is_target>false</oml:is_target>
<oml:is_ignore>false</oml:is_ignore>
<oml:is_row_identifier>false</oml:is_row_identifier>
<oml:number_of_missing_values>48917</oml:number_of_missing_values>
</oml:feature>
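As a quick check (sketch only, with a placeholder id standing in for the dataset behind task 360115), openml-python can list which features the server reports as string-typed:

```python
import openml

DATASET_ID = 0  # placeholder: the dataset id behind task 360115

d = openml.datasets.get_dataset(DATASET_ID)
string_features = [f.name for f in d.features.values() if f.data_type == "string"]
print(len(string_features), string_features[:5])
```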
Thanks for the report! That's an error made when uploading the dataset; those features should be nominal. We'll update.
And one more using task 360112 and fold 4:
[ERROR] [amlb.benchmark:14:20:52.275] 23 columns passed, passed data had 22 columns
Traceback (most recent call last):
File "/bench/venv/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 564, in _list_to_arrays
columns = _validate_or_indexify_columns(content, columns)
File "/bench/venv/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 689, in _validate_or_indexify_columns
f"{len(columns)} columns passed, passed data had "
AssertionError: 23 columns passed, passed data had 22 columns
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/bench/amlb/benchmark.py", line 444, in run
meta_result = self.benchmark.framework_module.run(self._dataset, task_config)
File "/bench/frameworks/autosklearn/__init__.py", line 27, in run
input_data=data, dataset=dataset, config=config)
File "/bench/frameworks/shared/caller.py", line 93, in run_in_venv
target_is_encoded=res.target_is_encoded)
File "/bench/amlb/results.py", line 238, in save_predictions
df = to_data_frame(probabilities, columns=prob_cols)
File "/bench/amlb/datautils.py", line 150, in to_data_frame
return pd.DataFrame.from_records(obj, columns=columns)
File "/bench/venv/lib/python3.7/site-packages/pandas/core/frame.py", line 1786, in from_records
arrays, columns = to_arrays(data, columns)
File "/bench/venv/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 548, in to_arrays
return _list_to_arrays(data, columns, coerce_float=coerce_float, dtype=dtype)
File "/bench/venv/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 567, in _list_to_arrays
raise ValueError(e) from e
ValueError: 23 columns passed, passed data had 22 columns
I tried to reproduce this locally using the latest version of the connector (the run above was using commit 2e41ca137c5fe46308252c0527c84716badfc3cb) and got:
Removing ignored columns None.
Job `local.openml_t_360112.1h1c.KDDCup99.5.autosklearn` failed with error: argument of type 'NoneType' is not iterable
argument of type 'NoneType' is not iterable
Traceback (most recent call last):
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/job.py", line 69, in start
result = self._run()
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/benchmark.py", line 397, in _run
return self.run()
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/utils/process.py", line 521, in profiler
return fn(*args, **kwargs)
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/benchmark.py", line 449, in run
self._dataset.release()
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/utils/process.py", line 521, in profiler
return fn(*args, **kwargs)
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/data.py", line 230, in release
self.train.release()
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/utils/process.py", line 521, in profiler
return fn(*args, **kwargs)
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/datasets/openml.py", line 100, in train
self._ensure_loaded()
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/utils/process.py", line 521, in profiler
return fn(*args, **kwargs)
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/datasets/openml.py", line 155, in _ensure_loaded
self._load_split()
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/datasets/openml.py", line 163, in _load_split
self._prepare_split_data(train_path, test_path)
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/datasets/openml.py", line 182, in _prepare_split_data
col_selector, attributes = zip(*[(i, a) for i, a in enumerate(ds['attributes'])
File "/home/feurerm/sync_dir/projects/openml/automlbenchmark/amlb/datasets/openml.py", line 183, in <listcomp>
if a[0] not in self._oml_dataset.ignore_attribute])
TypeError: argument of type 'NoneType' is not iterable
Thanks for the report! The latter looks easy to fix, but I don't know if it reveals a different problem. I'll have a go at it tomorrow.
I also found an issue with task 360115: with 32 GB of memory it runs OOM before even getting to the framework call for AutoGluon. It seems that the training data alone takes 5 GB of space and is duplicated enough times in the benchmark to run OOM before calling task.fit. I'm not sure whether any other frameworks succeed on this dataset, but it may be too large to reasonably work with 32 GB of memory.
With optimization it might be possible, by ensuring that the data is held in memory only a minimal number of times and then cleaned from memory except for the one instance that is fed into the framework's fit call.
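A sketch of that idea, with hypothetical paths and names (not the amlb code): load the serialized split once, wrap it without an up-front copy where pandas allows it, and drop the extra reference before handing the frame to the framework's fit call.

```python
import numpy as np
import pandas as pd

# Load the serialized split once (object dtype, hence allow_pickle).
data = np.load("/tmp/train.data.npy", allow_pickle=True)

# Wrap without an immediate copy where possible; pandas may still copy during
# dtype consolidation, which is exactly where the MemoryError above was raised.
train = pd.DataFrame(data, copy=False)

# Drop the standalone ndarray reference so only the DataFrame keeps the data alive.
del data

# ... hand `train` to the framework's fit() call here ...
```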
[INFO] [amlb.print:20:49:15.642] 'problem_type': 'binary',
[INFO] [amlb.print:20:49:15.642] 'target': {'classes': ['-1', '1'], 'name': 'upselling'},
[INFO] [amlb.print:20:49:15.642] 'test': {'data': '/tmp/tmppz35pulo/test.data.npy'},
[INFO] [amlb.print:20:49:15.642] 'train': {'data': '/tmp/tmppz35pulo/train.data.npy'}}
[INFO] [amlb.print:20:49:15.926]
[INFO] [amlb.print:20:49:15.926] Traceback (most recent call last):
[INFO] [amlb.print:20:49:15.926]
[INFO] [amlb.print:20:49:15.926] File "/repo/frameworks/shared/callee.py", line 85, in call_run
[INFO] [amlb.print:20:49:15.926]
[INFO] [amlb.print:20:49:15.926] result = run_fn(ds, config)
[INFO] [amlb.print:20:49:15.926]
[INFO] [amlb.print:20:49:15.926] File "/repo/frameworks/AutoGluon/exec.py", line 47, in run
[INFO] [amlb.print:20:49:15.926]
[INFO] [amlb.print:20:49:15.926] train = pd.DataFrame(dataset.train.data, columns=column_names).astype(column_types, copy=False)
[INFO] [amlb.print:20:49:15.926]
[INFO] [amlb.print:20:49:15.926] File "/repo/frameworks/AutoGluon/venv/lib/python3.7/site-packages/pandas/core/frame.py", line 497, in __init__
[INFO] [amlb.print:20:49:15.926]
[INFO] [amlb.print:20:49:15.927] mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
[INFO] [amlb.print:20:49:15.927]
[INFO] [amlb.print:20:49:15.927] File "/repo/frameworks/AutoGluon/venv/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 234, in init_ndarray
[INFO] [amlb.print:20:49:15.927]
[INFO] [amlb.print:20:49:15.927] return create_block_manager_from_blocks(block_values, [columns, index])
[INFO] [amlb.print:20:49:15.927]
[INFO] [amlb.print:20:49:15.927] File "/repo/frameworks/AutoGluon/venv/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 1675, in create_block_manager_from_blocks
[INFO] [amlb.print:20:49:15.927]
[INFO] [amlb.print:20:49:15.927] mgr._consolidate_inplace()
[INFO] [amlb.print:20:49:15.927]
[INFO] [amlb.print:20:49:15.927] File "/repo/frameworks/AutoGluon/venv/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 988, in _consolidate_inplace
[INFO] [amlb.print:20:49:15.927]
[INFO] [amlb.print:20:49:15.927] self.blocks = tuple(_consolidate(self.blocks))
[INFO] [amlb.print:20:49:15.927]
[INFO] [amlb.print:20:49:15.927] File "/repo/frameworks/AutoGluon/venv/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 1909, in _consolidate
[INFO] [amlb.print:20:49:15.927]
[INFO] [amlb.print:20:49:15.927] list(group_blocks), dtype=dtype, can_consolidate=_can_consolidate
[INFO] [amlb.print:20:49:15.927]
[INFO] [amlb.print:20:49:15.928] File "/repo/frameworks/AutoGluon/venv/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 1934, in _merge_blocks
[INFO] [amlb.print:20:49:15.928]
[INFO] [amlb.print:20:49:15.928] new_values = new_values[argsort]
[INFO] [amlb.print:20:49:15.928]
[INFO] [amlb.print:20:49:15.928] MemoryError: Unable to allocate 5.03 GiB for an array with shape (15001, 45000) and data type object
[INFO] [amlb.print:20:49:15.928]
[INFO] [amlb.print:20:49:15.928]
[INFO] [amlb.print:20:49:15.932]
[ERROR] [amlb.benchmark:20:49:16.114] Unable to allocate 5.03 GiB for an array with shape (15001, 45000) and data type object
Traceback (most recent call last):
File "/repo/amlb/benchmark.py", line 444, in run
meta_result = self.benchmark.framework_module.run(self._dataset, task_config)
File "/repo/frameworks/AutoGluon/__init__.py", line 28, in run
input_data=data, dataset=dataset, config=config)
File "/repo/frameworks/shared/caller.py", line 78, in run_in_venv
raise NoResultError(res.error_message)
amlb.results.NoResultError: Unable to allocate 5.03 GiB for an array with shape (15001, 45000) and data type object
[INFO] [amlb.results:20:49:18.986] Loading metadata from `/s3bucket/output/predictions/KDDCup09-Upselling/0/metadata.json`.
[INFO] [amlb.results:20:49:20.409] Metric scores: { 'acc': nan,
'app_version': 'dev [https://github.com/Innixma/automlbenchmark, '
'autogluon-workspace, 5a1ac12]',
'auc': nan,
'balacc': nan,
'constraint': '1h8c',
'duration': nan,
'fold': 0,
'framework': 'AutoGluon',
'id': 'openml.org/t/360115',
'info': 'NoResultError: Unable to allocate 5.03 GiB for an array with shape '
'(15001, 45000) and data type object',
'logloss': nan,
'metric': 'auc',
'mode': 'aws',
'models_count': nan,
'params': '',
'predict_duration': nan,
'result': nan,
'seed': 462990052,
'task': 'KDDCup09-Upselling',
'training_duration': nan,
'utc': '2020-12-24T20:49:20',
'version': '0.0.15'}
[INFO] [amlb.job:20:49:20.410] Job local.openml_s_271.1h8c.KDDCup09-Upselling.0.AutoGluon executed in 1980.425 seconds.
[INFO] [amlb.job:20:49:20.411] All jobs executed in 1980.426 seconds.
[INFO] [amlb.utils.process:20:49:20.412] [local.openml_s_271.1h8c.KDDCup09-Upselling.0.AutoGluon] CPU Utilization: 12.5%
[INFO] [amlb.utils.process:20:49:20.412] [local.openml_s_271.1h8c.KDDCup09-Upselling.0.AutoGluon] Memory Usage: 37.2%
[INFO] [amlb.utils.process:20:49:20.412] [local.openml_s_271.1h8c.KDDCup09-Upselling.0.AutoGluon] Disk Usage: 1.3%
[INFO] [amlb.benchmark:20:49:20.412] Processing results for
[INFO] [amlb.results:20:49:20.428] Scores saved to `/s3bucket/output/scores/AutoGluon.task_KDDCup09-Upselling.csv`.
[INFO] [amlb.results:20:49:20.440] Scores saved to `/s3bucket/output/scores/results.csv`.
[INFO] [amlb.results:20:49:20.451] Scores saved to `/s3bucket/output/results.csv`.
[INFO] [amlb.benchmark:20:49:20.458] Summing up scores for current run:
id task framework constraint fold metric mode version params app_version utc duration models_count seed info
0 openml.org/t/360115 KDDCup09-Upselling AutoGluon 1h8c 0 auc aws 0.0.15 dev [https://github.com/Innixma/automlbenchmark, autogluon-workspace, 5a1ac12] 2020-12-24T20:49:20 1980.4 462990052 NoResultError: Unable to allocate 5.03 GiB for an array with shape (15001, 45000) and data type object
`python runbenchmark.py constantpredictor openml/t/360115` works for me. While it is slow (~8 minutes), it does not run into memory issues. The ARFF file is only 1.7 GB. I did notice the feature types were not marked correctly on openml, so we made openml/t/360116, but this does not fix the issue. We'll look into improving the data loading after our break.