Request timeouts: https://www.openml.org/api/v1/json/data/features/23383
Trying to access https://www.openml.org/api/v1/json/data/features/23383 just times out and never returns anything. Accessing other URIs works fine.
One more example: https://www.openml.org/api/v1/json/data/features/41147
Another example: https://www.openml.org/api/v1/json/data/features/42435
Another one: https://www.openml.org/api/v1/json/data/features/42706
And: https://www.openml.org/api/v1/json/data/features/42708
And: https://www.openml.org/api/v1/json/data/features/43034
Update: this seems to be caused by bad feature types. Some are large datasets that have row IDs and other numeric values (e.g. lat-long values, dates, ...) encoded as categories (with a lot of values). The server returns the full list of categories in the feature list, so the request takes an insane amount of time and resources.
Best thing to do is probably to manually fix the encoding in the ARFF file and re-process the datasets. If there are other suggestions, please let me know.
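To find candidates for that manual fix, here is a minimal sketch that scans an ARFF header for nominal attributes with a suspiciously large number of categories. The function name `find_large_nominals` and the default threshold of 1000 are made up for illustration, and the parsing is deliberately naive: it assumes each `@attribute` declaration sits on one line and that category values contain no commas.

```python
import re

# Matches nominal declarations such as: @attribute name {a,b,c}
# Assumption: one declaration per line; the name is quoted or has no spaces.
NOMINAL_RE = re.compile(
    r"^@attribute\s+('[^']*'|\"[^\"]*\"|\S+)\s+\{(.*)\}\s*$",
    re.IGNORECASE,
)


def find_large_nominals(arff_lines, max_categories=1000):
    """Return (name, n_categories) for nominal attributes above the threshold.

    The threshold is an arbitrary illustrative default, not an OpenML rule.
    """
    suspicious = []
    for line in arff_lines:
        stripped = line.strip()
        if stripped.lower().startswith("@data"):
            break  # the header ends where the data section starts
        match = NOMINAL_RE.match(stripped)
        if match is None:
            continue  # numeric, string, date, comment, or blank line
        name = match.group(1).strip("'\"")
        # Naive split: does not handle quoted values that contain commas.
        categories = [c for c in match.group(2).split(",") if c.strip()]
        if len(categories) > max_categories:
            suspicious.append((name, len(categories)))
    return suspicious
```

Running it over the header lines of a local copy of the ARFF file (for example `find_large_nominals(open(path))`, with `path` being wherever the file was saved) gives a list of attributes worth checking by hand before changing their type to numeric.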
There are new ones like: https://www.openml.org/api/v1/json/data/features/44538
Thanks, we'll look into these.
@joaquinvanschoren Actually, after a lot more testing I figured out I was wrong. The code I was using had a relatively small timeout (1 minute) and these took close to two minutes to load. Sorry for the confusion and thank you for the response. I really like OpenML. I appreciate everything you're doing.
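For anyone hitting the same thing, a minimal sketch of fetching the feature list with a more generous timeout, using only the Python standard library. The helper names `features_url` and `fetch_features` and the 300-second default are assumptions picked for illustration; the only number from this thread is that these requests took close to two minutes.

```python
import json
import urllib.request

# Endpoint prefix taken from the URLs reported in this thread.
BASE_URL = "https://www.openml.org/api/v1/json/data/features"


def features_url(dataset_id):
    """Build the feature-list URL for a dataset ID."""
    return f"{BASE_URL}/{dataset_id}"


def fetch_features(dataset_id, timeout_seconds=300):
    """Fetch and decode the feature list, allowing slow responses.

    timeout_seconds is an illustrative default, chosen to sit well above
    the roughly two minutes observed for the slow datasets.
    """
    url = features_url(dataset_id)
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return json.load(response)
```

Note that the `timeout` argument of `urlopen` applies to individual blocking socket operations, not to the request as a whole, so it should be read as a lower bound on patience rather than a hard deadline.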
Great to hear! There are still a few that fail, mainly datasets with huge numbers of features. We might opt to resolve this in the new REST API, which we hope to deploy late this summer.