
Fix Autoxgboost reader issue

Open ja-thomas opened this issue 4 years ago • 3 comments

Now merging in the correct branch.

This might fix the CI problems.

ja-thomas avatar Jan 11 '22 11:01 ja-thomas

What should we do here? autoxgboost was withdrawn from the benchmark; should we also officially remove its integration from the master branch? If not, I feel we need to continue supporting people who want to run it.

PGijsbers avatar Aug 16 '22 08:08 PGijsbers

I'm trying to run autoxgboost on my data (input is a common CSV) and it fails with:

CalledProcessError: Command 'Rscript --vanilla -e ".libPaths('/bench/frameworks/autoxgboost/lib'); source('/bench/frameworks/autoxgboost/exec.R'); run('/input/test_data/differentiate_cancer_train.csv…

More specifically:

...
Parse with reader=readr : /input/test_data/differentiate_cancer_train.csv
Error in parseHeader(path) :
  Invalid column specification line found in ARFF header:
f_1,f_2,f_3,f_4,f_5,f_6,f_7,f_8,f_9,f_10,f_11,f_12,f_13,f_14,f_15,f_16,f_17,f_18,f_19,f_20,f_21,f_22,f_23,f_24,f_25,f_26,f_27,f_28,f_29,f_30,f_31,f_32,f_33,f_34,f_35,f_36,f_37,f_38,f_39,f_40,f_41,f_42,f_43,f_44,f_45,f_46,f_47,f_48,f_49,f_50,...

Searching around, I found this: https://machinelearningmastery.com/load-csv-machine-learning-data-weka/

It's about Weka, but it makes me wonder whether my data needs to be converted anyway. I'm also wondering whether this PR could help me.

BTW, the ranger and mlr3automl frameworks failed in the same way.

alanwilter avatar Sep 17 '22 07:09 alanwilter

Could you open a new issue that specifies exactly which versions you are using (OS, Python, AMLB), the command you use to start the experiment, the custom dataset configuration (yaml file) and, if possible, the dataset itself? It seems to be trying to read the CSV file as an ARFF file, which is problematic since ARFF requires a header.
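
For illustration, here is a minimal sketch of how one could check whether a file looks like ARFF before handing it to an ARFF parser. It is not AMLB's or the R reader's actual logic; the function name `looks_like_arff` and the sample strings are hypothetical. It relies only on the ARFF convention that `%` starts a comment line and the header begins with an `@relation` declaration.

```python
def looks_like_arff(text: str) -> bool:
    """Return True if the first meaningful line is an ARFF @relation declaration.

    Hypothetical helper for illustration only: blank lines and '%' comment
    lines are skipped, then the first remaining line is inspected.
    """
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("%"):
            continue  # skip blank lines and ARFF comments
        # ARFF keywords are case-insensitive
        return stripped.lower().startswith("@relation")
    return False  # empty file or comments only


# A plain CSV starts with a comma-separated column line, not an ARFF header.
csv_sample = "f_1,f_2,f_3\n0.1,0.2,1\n"
arff_sample = "% demo file\n@relation demo\n@attribute f_1 numeric\n@data\n0.1\n"

print(looks_like_arff(csv_sample))
print(looks_like_arff(arff_sample))
```

A CSV whose first line is a list of column names, like the one in the error above, would fail this check, which matches the "Invalid column specification line found in ARFF header" message.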

PGijsbers avatar Sep 19 '22 08:09 PGijsbers

Closing this PR; let's officially withdraw support for autoxgboost from the benchmark. We can reconsider only if someone steps up to fix the issues and indicates the intention to maintain the integration.

PGijsbers avatar Mar 03 '23 13:03 PGijsbers