Matthias Feurer
That sounds like a good proposal.
Hey, I just checked https://github.com/openml/docs/pull/15/files and the links aren't working yet. Does there need to be a release of the server for them to appear?
The documentation PR https://github.com/openml/docs/pull/15 just got merged. @joaquinvanschoren, who can update the new API docs to add a link to the new description?
Interesting, this is only an issue on the Python side, not on the command line. I will investigate this further. However, the md5 hashes of the other datasets should **never** change, right?
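To make that check concrete, here is a minimal sketch of verifying a file's md5 hash, assuming a locally downloaded ARFF file and a checksum copied from the dataset description (the filename and placeholder hash are illustrative, not actual server values):

```python
import hashlib

# Hypothetical values for illustration; the real checksum would come
# from the dataset description on the server.
path = "libras_move.arff"
expected_md5 = "0123456789abcdef0123456789abcdef"  # placeholder

with open(path, "rb") as f:
    actual_md5 = hashlib.md5(f.read()).hexdigest()

# If the file on the server was never modified, these should match.
print(actual_md5 == expected_md5)
```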
Hm, I think at least dataset 299 is not stored and sent in UTF-8, although it is advertised as such:
```
wget https://www.openml.org/data/v1/download/52200/libras_move.arff
file -i libras_move.arff
libras_move.arff: text/plain; charset=iso-8859-1
```
...
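As a reference point, here is a minimal sketch of how such a file could be re-encoded to UTF-8, assuming the source encoding really is iso-8859-1 as reported by `file -i` above:

```python
# Read the file with the encoding reported by `file -i`, then write
# it back out as UTF-8. Filenames are taken from the example above.
with open("libras_move.arff", encoding="iso-8859-1") as f:
    text = f.read()

with open("libras_move_utf8.arff", "w", encoding="utf-8") as f:
    f.write(text)
```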
Hi everyone, here are a few comments from my side: > So, should I fix the encoding (and update the hash in the file table) of all dataset files with...
I totally agree that every file should be UTF-8 and that files should ideally be converted on upload. @joaquinvanschoren you pointed to the wrong Jan; you wanted to mention @janvanrijn.
I just had a look at the attributes of the dataset, and while the test obviously works, I'm wondering whether a few other attributes should also be excluded from checking,...
Is the maximum URL length documented somewhere? Also, what's the server response? If the server response is something we can parse/display nicely, is there a reason to act upon this...
This can be reproduced with:
```python
import openml

openml.datasets.list_datasets(data_id=list(range(10000)))
```
@janvanrijn we can add a helper that sends either a GET or a POST request depending on the length...
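A minimal sketch of such a helper, assuming plain `requests` and a hypothetical URL-length cutoff (the actual server limit, and whether it accepts the same parameters via POST, would need to be confirmed):

```python
import requests

# Hypothetical cutoff: servers and proxies often limit URLs to a few
# thousand characters, but the OpenML server's real limit is an assumption.
MAX_URL_LENGTH = 2000

def get_or_post(url, params):
    """Send a GET request, falling back to POST when the encoded URL
    would exceed MAX_URL_LENGTH."""
    prepared = requests.Request("GET", url, params=params).prepare()
    if len(prepared.url) <= MAX_URL_LENGTH:
        return requests.get(url, params=params)
    # Too long for a URL: send the same parameters in the request body.
    return requests.post(url, data=params)
```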