openml-python
ValueError: could not convert string to float: 'f'
Description
Hello! When I try to do an exercise from "Running and sharing benchmarks" I get the same error ("ValueError: could not convert string to float: 'f'") in both Jupyter notebook and Colab. Does the error mean that there is a problem with my API key?
The code is the following (sorry if this is a very basic OpenML question):
Steps/Code to Reproduce
import openml
import sklearn.metrics
import sklearn.pipeline
import sklearn.preprocessing
import sklearn.tree

openml.config.apikey = 'API_KEY'  # set the OpenML API key
benchmark_suite = openml.study.get_suite('OpenML-CC18')  # obtain the benchmark suite

# build a scikit-learn classifier
clf = sklearn.pipeline.make_pipeline(sklearn.preprocessing.Imputer(),
                                     sklearn.tree.DecisionTreeClassifier())

for task_id in benchmark_suite.tasks:  # iterate over all tasks
    task = openml.tasks.get_task(task_id)  # download the OpenML task
    run = openml.runs.run_model_on_task(clf, task)  # run the classifier on the task
    score = run.get_metric_score(sklearn.metrics.accuracy_score)  # compute accuracy
    print('Data set: %s; Accuracy: %0.2f' % (task.get_dataset().name, score.mean()))
    run.publish()  # publish the experiment on OpenML (optional, requires internet and an API key)
    print('URL for run: %s/run/%d' % (openml.config.server, run.run_id))
No, this means that there are categorical features that cannot be handled by the current pipeline. Is this an example we use in the documentation? If yes, we need to update it to work with categorical attributes.
The example in this PDF (https://jmlr.org/papers/v22/19-920.html) should work here.
That's right, it's an example from the documentation; here is the link: https://openml.github.io/docs/benchmark/
Alright, can you confirm that the other snippet works for you?
Thanks a lot for the link, so this page (https://github.com/openml/docs/blob/master/docs/benchmark.md) needs to be updated. Would you like to create a PR for doing so?
Yes, sure. So that means that in the pipeline I need to use one method for categorical features and another for numerical features, right? I don't know if there is a way to find benchmark suites with numerical features only.
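For reference, a pipeline that treats the two feature types separately could look like the sketch below, using scikit-learn's ColumnTransformer. The toy DataFrame and column names are made up for illustration and are not taken from an OpenML task; the preprocessing choices (mean imputation for numeric, most-frequent imputation plus one-hot encoding for categorical) are just one reasonable option, not the documented fix.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# toy mixed-type data standing in for an OpenML dataset (hypothetical columns)
X = pd.DataFrame({
    'num': [1.0, 2.0, np.nan, 4.0],
    'cat': ['a', 'b', 'a', np.nan],
})
y = [0, 1, 0, 1]

# numeric columns: impute missing values with the column mean
numeric = Pipeline([('impute', SimpleImputer(strategy='mean'))])

# categorical columns: impute with the most frequent value, then one-hot encode
categorical = Pipeline([
    ('impute', SimpleImputer(strategy='most_frequent')),
    ('onehot', OneHotEncoder(handle_unknown='ignore')),
])

# route each column group through its own preprocessing branch
clf = Pipeline([
    ('prep', ColumnTransformer([
        ('num', numeric, ['num']),
        ('cat', categorical, ['cat']),
    ])),
    ('tree', DecisionTreeClassifier(random_state=0)),
])

clf.fit(X, y)
print(clf.score(X, y))
```

When running against OpenML tasks, the column lists would come from the dataset's categorical-feature metadata rather than being hard-coded as here.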