openml-python issues

fix nonetype error during print for tasks without class labels

#### Reference Issue
Fixes #1100 and #1058

#### What does this PR implement/fix? Explain your changes.
Previously, if the `class_labels` attribute of a task was `None` when `__repr__` was called, a `NoneType`...

willcmartin
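The guard described in this PR can be sketched roughly as follows. This is a minimal illustration, not the openml-python code: `TaskSketch` is a hypothetical stand-in class, and the assumption that the error came from calling `len()` on `None` is ours, since the description is truncated.

```python
from typing import List, Optional


class TaskSketch:
    """Hypothetical stand-in for a task; not the openml-python class."""

    def __init__(self, task_id: int, class_labels: Optional[List[str]] = None):
        self.task_id = task_id
        self.class_labels = class_labels

    def __repr__(self) -> str:
        fields = [f"task_id={self.task_id}"]
        # Assumed failure mode: len(None) raises a TypeError mentioning
        # NoneType, so the class count is reported only when labels exist.
        if self.class_labels is not None:
            fields.append(f"n_classes={len(self.class_labels)}")
        return f"TaskSketch({', '.join(fields)})"
```

With this guard, printing a task without class labels (for example a regression task) simply omits the class count instead of raising.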

Set default value of avoid_duplicate_runs to false for run_model_on_task

#### Reference Issue
Closes #1143

#### What does this PR implement/fix? Explain your changes.
Sets the default value of `avoid_duplicate_runs` in the `run_model_on_task` function to `False`. When true, this option...

chadmarchand

Report minimal evaluation results in run printout

#### Description
When doing `print(run)`, the result is only some basic info like the task type ID. It shows no aggregated evaluation information. It would be nice if that could...

joaquinvanschoren

Good First Issue
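One possible shape for the printout requested in this issue, as a sketch only. `RunSketch` and its `fold_scores` attribute are hypothetical names chosen for illustration and do not reflect how openml-python stores evaluations; the idea is just to append a per-metric mean to the basic info.

```python
from statistics import mean
from typing import Dict, List


class RunSketch:
    """Hypothetical stand-in for a run; not the openml-python class."""

    def __init__(self, task_type_id: int, fold_scores: Dict[str, List[float]]):
        self.task_type_id = task_type_id
        # Assumed layout: metric name -> one score per fold.
        self.fold_scores = fold_scores

    def __str__(self) -> str:
        lines = [f"Task type ID: {self.task_type_id}"]
        for metric, scores in sorted(self.fold_scores.items()):
            if scores:  # skip metrics without any recorded folds
                lines.append(
                    f"{metric} (mean over {len(scores)} folds): {mean(scores):.4f}"
                )
        return "\n".join(lines)
```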

Update download dataset parser for change of `minio_url` to `parquet_url`

The `minio_url` field in the dataset description URL is deprecated and will be fully replaced by `parquet_url` in the future (in the meantime, both are available).

PGijsbers
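While both fields are available, a parser could prefer the new field and fall back to the deprecated one. The sketch below assumes the dataset description has already been parsed into a flat mapping keyed by the bare field names; `resolve_parquet_url` is a hypothetical helper, and the real description may use different (for example namespaced) keys.

```python
from typing import Mapping, Optional


def resolve_parquet_url(description: Mapping[str, str]) -> Optional[str]:
    """Return `parquet_url` if present, else the deprecated `minio_url`.

    Hypothetical helper: assumes a flat mapping with bare field names.
    Returns None when neither field is present or both are empty.
    """
    return description.get("parquet_url") or description.get("minio_url")
```

Once `minio_url` is removed entirely, the fallback branch can be dropped without changing callers.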

Upload datasets in parquet

2

Related to #1032, https://github.com/openml/OpenML/issues/1154. Instead of converting the datasets to arff and uploading them in that file format, datasets should be uploaded as parquet files. We can start with...

PGijsbers

enhancement

Data

get_metric_fn can't handle classification metrics using probabilities

7

```
import openml
import sklearn
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeClassifier

openml.config.apikey = "KEY"

# build a scikit-learn classifier
clf = sklearn.pipeline.make_pipeline(SimpleImputer(), DecisionTreeClassifier())
task_id...
```

pseudotensor

bug

Feature request

Run
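The distinction behind this issue is that some classification metrics score predicted labels while others need per-class probabilities. The dependency-free sketch below illustrates that difference; `accuracy` and `log_loss` here are illustrative helpers written for this example and say nothing about how `get_metric_fn` is implemented.

```python
import math
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Label-based metric: fraction of predicted labels that match."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def log_loss(
    y_true: Sequence[int],
    y_proba: Sequence[Sequence[float]],
    eps: float = 1e-15,
) -> float:
    """Probability-based metric: mean negative log-probability of the true class.

    Each row of y_proba holds one probability per class, indexed by class id.
    """
    total = 0.0
    for label, row in zip(y_true, y_proba):
        # Clip to avoid log(0) for overconfident predictions.
        p = min(max(row[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)
```

A function that only ever passes hard label predictions to the metric cannot evaluate something like `log_loss` meaningfully, which is the kind of limitation the issue title describes.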

docs bug referring to get_metric_score

1

https://docs.openml.org/benchmark/#running-and-sharing-benchmarks shows the following under "python code":

```
import openml
import sklearn

openml.config.apikey = 'FILL_IN_OPENML_API_KEY'  # set the OpenML Api Key
benchmark_suite = openml.study.get_suite('OpenML-CC18')  # obtain the benchmark suite

# build...
```

pseudotensor

Documentation

Uploading datasets with string columns to openml via api fails

6

#### More informative Code to Reproduce
```python
import requests
import pandas as pd
import openml
from openml.datasets.functions import create_dataset

# upload to test server
openml.config.start_using_configuration_for_example()

url = 'https://zenodo.org/record/3665663/files/dataset.csv?download=1'
r =...
```

Louquinze

bug

Data

Allow prob-based metrics to work

1

#### What does this PR implement/fix? Explain your changes.
Handling probabilities with some metrics

#### How should this PR be tested?
Code here: https://github.com/openml/openml-python/issues/1139

#### Any other comments?

pseudotensor

Default `output_format` to `"dataframe"`

7

I propose that the functions that take the `output_format` parameter and produce lists (e.g. `list_datasets`) change their defaults to return dataframes instead of dictionaries. I think originally we had...

PGijsbers

enhancement

Feature request

Good First Issue
