Matthias Feurer comments

Results: 353 comments of Matthias Feurer

Scipy sparse matrices not handled correctly by TPOT and autosklearn

Thanks for the clarification. Auto-sklearn should support sparse `X`, but we'll check, and will also check what the behavior for sparse `y` values is.

Failures on several new datasets

> Hi @mfeurer, may I ask you how did you run autosklearn against those datasets? Did you use the latest code on master?

No, but I'm doing that now. 1....

Failures on several new datasets

I just tried [the new tasks](https://github.com/openml/automlbenchmark/issues/187#issuecomment-745515523) and it turns out that [task 360115](https://www.openml.org/t/360115) has several string features which cannot be handled by the benchmark code itself:
```
[ERROR] [amlb.benchmark:13:35:02.848] could...
```
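As a rough illustration of how string features like these could be spotted before a run, here is a minimal sketch in plain Python. It is not part of the benchmark code: the function name `find_string_columns` and the toy data are hypothetical, and it assumes the dataset has already been read into a header list and a list of rows of raw values.

```python
def find_string_columns(header, rows):
    """Return the names of columns with at least one value that float() cannot parse.

    Note: missing-value markers such as "?" are also reported, because they
    are not parseable as numbers either.
    """
    def is_numeric(value):
        try:
            float(value)
            return True
        except (TypeError, ValueError):
            return False

    return [
        name
        for index, name in enumerate(header)
        if any(not is_numeric(row[index]) for row in rows)
    ]


# Hypothetical toy data, not taken from task 360115.
header = ["age", "city", "income"]
rows = [
    ["31", "Freiburg", "42000"],
    ["45", "Hannover", "51000.5"],
]
string_columns = find_string_columns(header, rows)
print(string_columns)
```

A check of this kind only flags candidate columns; whether they should be encoded, dropped, or reported upstream is a separate decision.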

Failures on several new datasets

And one more using task 360112 and fold 4:
```
[ERROR] [amlb.benchmark:14:20:52.275] 23 columns passed, passed data had 22 columns
Traceback (most recent call last):
  File "/bench/venv/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 564, in...
```
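The message "23 columns passed, passed data had 22 columns" indicates that the number of column names does not match the width of the data rows. A minimal sketch of a pre-check, assuming the rows are available as plain lists, could look like this. The helper `find_width_mismatches` and the toy data are hypothetical and do not come from the benchmark framework.

```python
def find_width_mismatches(column_names, rows):
    """Return (row_index, row_width) for every row whose width differs
    from the number of column names."""
    expected = len(column_names)
    return [
        (index, len(row))
        for index, row in enumerate(rows)
        if len(row) != expected
    ]


# Hypothetical toy data: three names, but the second row has two values.
column_names = ["f1", "f2", "target"]
rows = [
    [1.0, 2.0, "a"],
    [3.0, "b"],
    [5.0, 6.0, "a"],
]
mismatches = find_width_mismatches(column_names, rows)
print(mismatches)
```

Running such a check before constructing a data frame points at the offending rows directly, rather than surfacing only the aggregate column count in a traceback.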

Train/test file header may not contain all categories of a categorical variable

Thanks a lot for the quick answer. Do you by any chance know if any other dataset from the 2019 benchmark is affected by the 2nd problem?
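The problem named in the issue title, a file header that does not declare every category of a categorical variable, can be checked for with a sketch along the following lines. It assumes the declared categories and the observed values of one column are already available as Python lists. The function `missing_categories` and the example values are hypothetical.

```python
def missing_categories(declared, observed_values):
    """Return the sorted observed values that the header does not declare.

    None is treated as a missing value and ignored.
    """
    declared_set = set(declared)
    observed = {value for value in observed_values if value is not None}
    return sorted(observed - declared_set)


# Hypothetical example: the header was derived from the train split only.
declared_in_header = ["green", "red"]
train_values = ["red", "green", "red", None]
test_values = ["green", "blue"]

missing_in_train = missing_categories(declared_in_header, train_values)
missing_in_test = missing_categories(declared_in_header, test_values)
print(missing_in_train, missing_in_test)
```

Applying the check to each categorical column of each split would list the datasets affected by this kind of mismatch.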

Train/test file header may not contain all categories of a categorical variable

> Do you happen to know a dataset which has problem 1 (all-null columns) but not problem 2?

No, sorry.

> No, but it should be easy to write a...

Train/test file header may not contain all categories of a categorical variable

I'd be happy to have an update to openml-python. Would you like to create one or open an issue there?

Benchmark Update: Regression and more Classification!

Hi all, I am currently setting up scripts to generate meta-data and think that there is a discrepancy in the benchmark suite IDs. I'm running the following code:
```
datasets_suite_218...
```

Benchmark Update: Regression and more Classification!

FYI I posted an error related to the new task 360115 in issue #233 as it appears to be incompatible with the benchmarking framework.

Benchmark Update: Regression and more Classification!

> The version used in the benchmark should be publicly released (e.g. on PyPI), but it can be a development/pre-release version. We will not allow fixing versions to specific git...