openml-python
Python module to interface with OpenML
#### Reference Issue Fixes #1100 and #1058 #### What does this PR implement/fix? Explain your changes. Previously, if the `class_labels` attribute of a task was `None` when `__repr__` was called, a `NoneType`...
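The preview above is truncated, so neither the exact failure nor the actual fix is visible here. As a minimal sketch only, assuming the failure came from calling `len()` on a missing label list, a guarded `__repr__` might look like the following. `TaskSketch` and its field names are hypothetical and are not the openml-python task class.

```python
from typing import List, Optional


class TaskSketch:
    """Hypothetical stand-in for a task object; not the openml-python class."""

    def __init__(self, task_id: int, class_labels: Optional[List[str]] = None):
        self.task_id = task_id
        self.class_labels = class_labels

    def __repr__(self) -> str:
        fields = {"Task ID": self.task_id}
        # Only call len() when labels are present; len(None) raises TypeError.
        if self.class_labels is not None:
            fields["# of Classes"] = len(self.class_labels)
        body = "\n".join(f"{key}: {value}" for key, value in fields.items())
        return f"TaskSketch\n{body}"


# Usage: both calls succeed, with or without labels.
print(TaskSketch(1))
print(TaskSketch(2, ["yes", "no"]))
```

Omitting the class count when no labels exist keeps the representation usable for tasks that have no class labels at all.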
#### Reference Issue Closes #1143 #### What does this PR implement/fix? Explain your changes. Sets the default value of `avoid_duplicate_runs` in the `run_model_on_task` function to `False`. When `True`, this option...
#### Description Calling `print(run)` shows only basic information such as the task type ID; it shows no aggregated evaluation information. It would be nice if that could...
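To illustrate what "aggregated evaluation information" could mean in a printed summary, here is a small sketch that reduces per-fold scores to a mean and standard deviation per metric. The function name `summarize_fold_scores` and the input layout (metric name mapped to a list of fold scores) are assumptions made for this example; they do not reflect how openml-python stores evaluations.

```python
from statistics import mean, stdev
from typing import Dict, List


def summarize_fold_scores(fold_scores: Dict[str, List[float]]) -> str:
    """Hypothetical helper: one 'metric: mean +- std' line per metric."""
    lines = []
    for metric, scores in sorted(fold_scores.items()):
        # stdev() needs at least two values, so report 0.0 for a single fold.
        spread = stdev(scores) if len(scores) > 1 else 0.0
        lines.append(f"{metric}: {mean(scores):.4f} +- {spread:.4f}")
    return "\n".join(lines)


# Usage with made-up scores for two metrics.
print(summarize_fold_scores({"accuracy": [0.5, 0.5], "auc": [0.75]}))
```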
The `minio_url` field in the dataset description URL is deprecated and will be fully replaced by `parquet_url` in the future (in the meantime, both are available).
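While both fields are available, client code could prefer the new field and fall back to the deprecated one. The sketch below assumes the description is available as a plain mapping with those two keys; `pick_parquet_location` is a hypothetical helper, not part of openml-python.

```python
from typing import Mapping, Optional


def pick_parquet_location(description: Mapping[str, Optional[str]]) -> Optional[str]:
    """Hypothetical helper: prefer `parquet_url`, fall back to `minio_url`."""
    # `or` also skips empty strings, so a blank new field still falls back.
    return description.get("parquet_url") or description.get("minio_url")


# Usage with placeholder URLs.
print(pick_parquet_location({"minio_url": "https://example.org/old.pq"}))
print(pick_parquet_location({
    "parquet_url": "https://example.org/new.pq",
    "minio_url": "https://example.org/old.pq",
}))
```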
Related to #1032, https://github.com/openml/OpenML/issues/1154. Rather than converting datasets to ARFF and uploading them in that format, datasets should be uploaded as Parquet files. We can start with...
```
import openml
import sklearn
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeClassifier

openml.config.apikey = "KEY"

# build a scikit-learn classifier
clf = sklearn.pipeline.make_pipeline(SimpleImputer(), DecisionTreeClassifier())
task_id...
```
https://docs.openml.org/benchmark/#running-and-sharing-benchmarks shows the following under "python code":
```
import openml
import sklearn

openml.config.apikey = 'FILL_IN_OPENML_API_KEY'  # set the OpenML Api Key
benchmark_suite = openml.study.get_suite('OpenML-CC18')  # obtain the benchmark suite

# build...
```
#### More informative Code to Reproduce
```python
import requests
import pandas as pd
import openml
from openml.datasets.functions import create_dataset

# upload to test server
openml.config.start_using_configuration_for_example()

url = 'https://zenodo.org/record/3665663/files/dataset.csv?download=1'
r =...
```
#### What does this PR implement/fix? Explain your changes. Handling probabilities with some metrics #### How should this PR be tested? code here: https://github.com/openml/openml-python/issues/1139 #### Any other comments?
I propose that the functions that take the `output_format` parameter and produce lists (e.g. `list_datasets`) change their defaults to return dataframes instead of dictionaries. I think originally we had...